Abstract

We present an iterative approach to constructing pseudorandom generators, based on the
repeated application of mild pseudorandom restrictions. We use this template to construct
pseudorandom generators for combinatorial rectangles and read-once CNFs and a hitting set
generator for width-3 branching programs, all of which achieve near-optimal seed-length even in
the low-error regime: we get seed-length $\tilde{O}(\log(n/\epsilon))$ for error $\epsilon$. Previously, only constructions
with seed-length $O(\log^{3/2} n)$ or $O(\log^2 n)$ were known for these classes with error $\epsilon = 1/\mathrm{poly}(n)$.

The (pseudo)random restrictions we use are milder than those typically used for proving 
circuit lower bounds in that we only set a constant fraction of the bits at a time. While such
restrictions do not simplify the functions drastically, we show that they can be derandomized 
using small-bias spaces. 
1 Introduction



1.1 Pseudorandom Generators 

The theory of pseudorandomness has given compelling evidence that very strong pseudorandom
generators exist. For example, assuming that there are computational problems solvable in
exponential time that require exponential-sized circuits, Impagliazzo and Wigderson [IW97] have shown
that for every $n$, $c$ and $\epsilon > 0$, there exist efficient pseudorandom generators (PRGs) mapping a
random seed of length $O(\log(n^c/\epsilon))$ to $n$ pseudorandom bits that cannot be distinguished from
$n$ uniformly random bits with probability more than $\epsilon$ by any Boolean circuit of size $n^c$. These
PRGs, which fool arbitrary efficient computations (represented by polynomial-sized Boolean
circuits), have remarkable consequences for derandomization: every randomized algorithm can be
made deterministic with only a polynomial slowdown, and thus P = BPP.

These results, however, remain conditional on a circuit complexity assumption whose proof 
seems far off at present. Since PRGs that fool a class of Boolean circuits also imply lower bounds 
for that class, we cannot hope to remove the assumption. Thus unconditional generators are only 
possible for restricted models of computation for which we have lower bounds. 

Bounded-depth circuits and bounded-space algorithms are two models of computation for which
we know how to construct PRGs with $O(\log^{O(1)}(n/\epsilon))$ seed length [Nis91, Nis92]. Known PRG
constructions for these classes have found several striking applications including the design of
streaming algorithms [Ind06], algorithmic derandomization [Siv02], randomness extractors [Tre01],
hashing [CRSW11], hardness amplification [HVV06], almost $k$-wise independent permutations [KNR05],
and cryptographic PRGs [HHR06]. Arguably, constructing PRGs with the optimal $O(\log(n/\epsilon))$ seed
length for these classes are two of the outstanding open problems in derandomization.

Nisan [Nis92] devised a PRG of seed length $O(\log^2 n)$ that fools polynomial-width branching
programs, the non-uniform model of computation that captures logspace randomized algorithms:
a space-$s$ algorithm is modeled by a branching program^1 of width $2^s$. Nisan's generator has been
used by Saks and Zhou [SZ99] to prove that every randomized logspace algorithm can be
simulated in space $O(\log^{3/2} n)$. Nisan's generator remains the best known generator for polynomial-width
branching programs (and logspace randomized algorithms) and, despite much progress in this area
[INW94, NZ96, RR99, Rei08, RTV06, BRRY10, BV10b, KNP11, De11], there are very few cases
where we can improve on Nisan's twenty-year-old bound of $O(\log^2 n)$ [Nis92]. For constant-width
regular branching programs, Braverman et al. [BRRY10] have given a pseudorandom generator
with seed length $\tilde{O}((\log n) \cdot (\log(1/\epsilon)))$, which is $\tilde{O}(\log n)$ for $\epsilon = 1/\mathrm{polylog}(n)$, but is no better
than Nisan's generator when $\epsilon = 1/\mathrm{poly}(n)$. Only for constant-width permutation branching
programs and for width-2 branching programs has seed length $O(\log(n/\epsilon))$ been achieved, by Koucký,
Nimbhorkar, and Pudlák [KNP11] and Saks and Zuckerman [SZ95], respectively. Remarkably, even for
width-3 branching programs we do not know of any efficiently computable PRG with seed length
$o(\log^2 n)$. Recently, Šíma and Žák [SZ11] have constructed hitting set generators (HSGs, which
are a weaker form of pseudorandom generators) for width-3 branching programs with optimal seed
length $O(\log n)$, for a large error parameter $\epsilon > 5/6$.

In a different work, Nisan [Nis91] also gave a PRG that $\epsilon$-fools $AC^0$ circuits of depth
$d$ and size $s$ using seed length $O(\log^{2d+6}(s/\epsilon))$. For the special case of depth-2 circuits, that is,
CNFs and DNFs, the work of Bazzi [Baz09], simplified by Razborov [Raz09], provides a PRG of seed
length $O(\log n \cdot \log^2(s/\epsilon))$, which has been improved to $\tilde{O}(\log^2(s/\epsilon))$ by De et al. [DETT10]. For
the restricted case of read-$k$ DNFs and CNFs, De et al. (for $k = 1$) and Klivans et al. [KLW10] (for
constant $k$) improve the seed length to $O(\log \epsilon^{-1} \cdot \log s)$, which is optimal for constant $\epsilon$, but is
essentially no better than the bound for general CNFs and DNFs when $\epsilon$ is polynomial in $1/n$.

^1 Space-bounded randomized algorithms are modeled by oblivious, read-once branching programs, which read the
input bits in a specified order and read each input bit only once. In this paper, all references to "branching
programs" refer to "oblivious read-once branching programs."

The model of combinatorial rectangles is closely related to both bounded-width branching
programs and read-once CNFs, and combinatorial rectangles are interesting objects with a variety of
applications of their own [ASWZ96]. The problem of constructing PRGs for combinatorial rectangles is closely
related to the construction of small sample spaces that approximate the uniform distribution on
many multivalued random variables [EGL+98]: they can be seen as an alternate generalization
of the versatile notion of almost $k$-wise independent distributions on $\{0,1\}^n$ to larger domains
$[m]^n$. Versions of this problem where each coordinate is a real interval were first studied in
number theory and analysis [ASWZ96]. Subsequently there has been much work on this problem
[EGL+98, LLSZ97, ASWZ96, Lu02, Vio11]. A PRG with seed length $O(\log n + \log^{3/2}(1/\epsilon))$ [Lu02]
is known for combinatorial rectangles; such a generator achieves the optimal seed length $O(\log n)$
when $\epsilon \ge 2^{-O(\log^{2/3} n)}$ but not for $\epsilon = 1/\mathrm{poly}(n)$. It is known how to construct HSGs (which are a
weakening of PRGs) with seed length $O(\log(n/\epsilon))$ [LLSZ97].

Indeed, there are few models of computation for which we know how to construct PRGs with the
optimal seed length $O(\log(n/\epsilon))$ or even $\log^{1+o(1)}(n/\epsilon)$. The most prominent examples are
bounded-degree polynomials over finite fields [NN93, AGHP92, BV10a, Lov08, Vio08], with parities (which
are fooled by small-bias distributions [NN93]) as a special case, and models that can be reduced to
these cases, such as width-2 branching programs [SZ95, BDVY09].

In summary, there are several interesting models of computation for which a polylogarithmic
dependence on $n$ and $1/\epsilon$ is known, and the dependence on one parameter is logarithmic on its
own (e.g. seed length $O(\log n \log(1/\epsilon))$), but a logarithmic bound in both parameters together has
been elusive. Finally, we remark that not having a logarithmic dependence on the error $\epsilon$ is often
a symptom of a more fundamental bottleneck. For instance, HSGs with constant error for width-4
branching programs imply HSGs with polynomially small error for width-3 branching programs,
so achieving the latter is a natural first step towards the former. A polynomial-time computable
PRG for CNFs with seed length $O(\log(n/\epsilon))$ would imply the existence of a problem in exponential
time that requires depth-3 circuits of size $2^{\omega(\sqrt{n})}$ and that cannot be solved by general circuits of size
$O(n)$ and depth $O(\log n)$, which is a long-standing open problem in circuit complexity [Val77].

1.2 Our Results 

In this paper, we construct the first generators with seed length $\tilde{O}(\log(n/\epsilon))$ (where $\tilde{O}(\cdot)$ hides
polylogarithmic factors in its argument) for several well-studied classes of functions mentioned
above.

• PRGs for combinatorial rectangles. Previously, it was known how to construct HSGs with seed
length $O(\log(n/\epsilon))$ [LLSZ97], but the best seed length for PRGs was $O(\log n + \log^{3/2}(1/\epsilon))$ [Lu02].

• PRGs for read-once CNF and DNF formulas. Previously, De, Etesami, Trevisan, and
Tulsiani [DETT10] and Klivans, Lee and Wan [KLW10] had constructed PRGs with seed length
$O(\log n \cdot \log(1/\epsilon))$.

• HSGs for width-3 branching programs. Previously, Šíma and Žák [SZ11] had constructed
hitting set generators for width-3 branching programs with seed length $O(\log n)$ in case the
error parameter $\epsilon$ is very large (greater than 5/6).
As a corollary of our PRG for combinatorial rectangles we get improved hardness amplification in
NP by combining our results with those of Lu, Tsai and Wu [LTW07]; we refer to Section 5 for
details.^2



1.3 Techniques 

Our generators are all based on a general new technique: the iterative application of "mild"
(pseudo)random restrictions.

To motivate our technique, we first recall Håstad's switching lemma [Ajt83, FSS84, Has86]: if we
randomly assign a $1 - 1/O(k)$ fraction of the variables of a $k$-CNF, then the residual formula on the
$n/O(k)$ unassigned variables is likely to become a constant. Ajtai and Wigderson [AW85] proposed
the following natural approach to constructing PRGs for CNFs: construct a small pseudorandom
family of restrictions that: 1) makes any given CNF collapse to a constant function with high
probability; and 2) ensures that the CNF collapses to each constant function with the right
probability as determined by the bias of the formula. Known derandomizations of the switching lemma
are far from optimal in terms of the number of random bits needed [AW85, AAI+01, GMR12].
We will show that, for read-once CNFs, such a pseudorandom restriction can be generated using
$\tilde{O}(\log(m/\epsilon))$ random bits.

We apply restrictions that only set a constant fraction of the variables at a time. The novel
insight in our construction is that although we cannot set all the bits in one go from a small-bias
distribution, we can set a constant fraction of bits from such a distribution and prove that the bias
of the formula is preserved (on average). Hence we use only $O(\log(m/\epsilon))$ truly random bits per
phase. While such mild random restrictions do not drastically simplify the formulas, we show that
in each phase a suitable measure of progress improves (e.g. most clauses will either be satisfied or
will have reduced width), implying that the formula collapses to a constant after $O(\log\log(m/\epsilon))$
steps; and so the total randomness will be $\tilde{O}(\log(m/\epsilon))$. The idea of setting a few variables at
a time is inspired by a recent PRG for hashing balls into bins due to Celis, Reingold, Segev, and
Wieder [CRSW11].

We illustrate our technique below with a toy example. 



A Toy Example. Consider a read-once CNF formula $f$ of width $w$ with $m = 2^{w+1}$ clauses in
which the variables appear in order (aka the Tribes function of [BL85]). That is,

$$f(x) = f_1(x_1, \ldots, x_w) \wedge f_2(x_{w+1}, \ldots, x_{2w}) \wedge \cdots \wedge f_m(x_{(m-1)w+1}, \ldots, x_{mw}),$$

where each $f_i$ is the OR function. $f$ has constant bias and can be computed both by a combinatorial
rectangle and a width-3 branching program. De et al. showed that fooling this function with error
$\epsilon$ using small-bias spaces requires seed-length $\Omega(w \log(1/\epsilon)/\log\log(1/\epsilon))$.

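Assume we partition the input bits into two parts: $x$, which contains the first $w/2$ variables of
each clause, and $y$, which contains the rest. Let $x \circ y$ denote the concatenation of the two strings.
We would like to show that for $\mathcal{D}$ a small-bias distribution and $\mathcal{U}$ the uniform distribution,

$$\left| \mathbb{E}_{x \sim \mathcal{D},\, y \sim \mathcal{U}}[f(x \circ y)] - \mathbb{E}_{x \sim \mathcal{U},\, y \sim \mathcal{U}}[f(x \circ y)] \right| \le \epsilon. \qquad (1.1)$$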



A naive approach might be to view setting $y \sim \mathcal{U}$ as applying a random restriction with
probability 1/2. If this simplified the function $f$ to the extent that it can be fooled by small-bias
spaces, we would be done. Unfortunately, this is too much to hope for; it is not hard to see that
such a random restriction is very likely to give another Tribes-like function with width $w/2$, which
is not much easier to fool using small bias than $f$ itself.

^2 We thank an anonymous referee for pointing out this application.

Rather, we need to shift our attention to the bias function of $f$. For each partial assignment $x$,
we define the bias function $F(x)$ as

$$F(x) = \mathbb{E}_{y \sim \mathcal{U}}[f(x \circ y)]. \qquad (1.2)$$

We can now rewrite Equation (1.1) as

$$\left| \mathbb{E}_{x \sim \mathcal{D}}[F(x)] - \mathbb{E}_{x \sim \mathcal{U}}[F(x)] \right| \le \epsilon. \qquad (1.3)$$

Our key insight is that for restrictions as above, the function $F$ is in fact easy to fool using a
small-biased space. This is despite the fact that $F(x)$ is an average of functions $f(x \circ y)$ (by
Equation (1.2)), most of which are Tribes-like and hence are not easy to fool.

Let us give some intuition for why this happens. Since $f(x \circ y) = \prod_{i=1}^m f_i(x \circ y)$,

$$F(x) = \mathbb{E}_{y \sim \mathcal{U}}[f(x \circ y)] = \prod_{i=1}^m \mathbb{E}_{y \sim \mathcal{U}}[f_i(x \circ y)] = \prod_{i=1}^m F_i(x),$$

where $F_i(x)$ is the bias function of the $i$-th clause. But note that over a random choice of $y$, $f_i$ is
set to 1 with probability $1 - 2^{-w/2}$ and is a clause of width $w/2$ otherwise. Hence

$$F_i(x) = \mathbb{E}_{y \sim \mathcal{U}}[f_i(x \circ y)] = 1 - \frac{1}{2^{w/2}} + \frac{\bigvee_{j=1}^{w/2} x_{w(i-1)+j}}{2^{w/2}}.$$

As a consequence, over a random choice of $x$, we now have

$$F_i(x) = \begin{cases} 1 & \text{w.p. } 1 - 2^{-w/2}, \\ 1 - 2^{-w/2} & \text{w.p. } 2^{-w/2}. \end{cases}$$

Thus each $F_i(x)$ is a random variable with $\mathbb{E}_x[F_i(x)] = 1 - 2^{-w}$ and $\mathrm{Var}_x[F_i(x)] \approx 2^{-3w/2}$. In
contrast, when we assign all the variables in the clauses at once, each $f_i(x)$ behaves like a Bernoulli
random variable with bias $1 - 2^{-w}$. While it also has $\mathbb{E}_x[f_i(x)] = 1 - 2^{-w}$, the variance is much
larger: $\mathrm{Var}_x[f_i(x)] \approx 2^{-w}$. The qualitative difference between $2^{-3w/2}$ and $2^{-w}$ is that in the former
case, the sum of the variances over all $2^{w+1}$ clauses is small (about $2^{-w/2}$), but in the latter it is more
than 1. We leverage the small total variance to show that small-bias fools $F$, even though it does
not fool $f$ itself. Indeed, setting any constant fraction $q < 1$ of variables in each clause would work.
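The variance comparison above can be checked exactly for a small width. The following is only an illustrative sketch: the function names and the choice $w = 10$ are ours, not the paper's, and it uses exact rational arithmetic to compute the mean and variance of a single $F_i$ and of a fully assigned clause $f_i$.

```python
from fractions import Fraction

def clause_bias_stats(w):
    """Exact mean and variance of F_i(x) for one width-w OR clause, where x is a
    uniform assignment to the first w/2 variables and the rest are averaged out."""
    assert w % 2 == 0
    p = Fraction(1, 2 ** (w // 2))   # Pr[x alone leaves the clause unsatisfied]
    hi, lo = Fraction(1), 1 - p      # F_i = 1 if x satisfies the clause, else 1 - 2^{-w/2}
    mean = (1 - p) * hi + p * lo
    var = (1 - p) * hi ** 2 + p * lo ** 2 - mean ** 2
    return mean, var

def full_clause_variance(w):
    """Variance of f_i when all w variables are assigned at once (a Bernoulli variable)."""
    q = Fraction(1, 2 ** w)          # Pr[the clause is falsified]
    return q * (1 - q)

w = 10                               # illustrative width (an assumption)
m = 2 ** (w + 1)                     # number of clauses in the toy example
mean, var = clause_bias_stats(w)
total_mild = m * var                 # total variance under the mild restriction
total_full = m * full_clause_variance(w)
print(float(total_mild), float(total_full))
```

For this width the total variance under the mild restriction is well below 1, while assigning all variables at once gives a total above 1, matching the qualitative gap described in the text.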

We now sketch our proof that small-bias spaces fool $F$. Let $g_i(x) = F_i(x) - (1 - 2^{-w})$ be $F_i$
shifted to have mean 0, so that $\mathbb{E}_x[g_i(x)^2] = \mathrm{Var}[F_i(x)]$. We can write

$$F(x) = \prod_{i=1}^m \left(1 - 2^{-w} + g_i(x)\right) = \sum_{k=0}^m c_k S_k(g_1(x), \ldots, g_m(x)), \qquad (1.4)$$

where $S_k$ denotes the $k$-th elementary symmetric polynomial (with $S_0 = 1$) and $c_k \in [0, 1]$.^3
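As a sanity check on the expansion in Equation (1.4), one can verify numerically that $\prod_i (a + g_i) = \sum_{k=0}^m a^{m-k} S_k(g_1, \ldots, g_m)$, so that with $a = 1 - 2^{-w}$ the coefficients $c_k = a^{m-k}$ lie in $[0, 1]$. The sketch below uses arbitrary small values for the $g_i$ and for $a$; these numbers are assumptions made for illustration only.

```python
from itertools import combinations
from math import prod, isclose
import random

def elementary_symmetric(values, k):
    """S_k(values): sum over all k-subsets of the product of their entries (S_0 = 1)."""
    return sum(prod(subset) for subset in combinations(values, k))

random.seed(0)
m = 6
a = 0.9                                   # stands in for 1 - 2^{-w}
g = [random.uniform(-0.1, 0.1) for _ in range(m)]

lhs = prod(a + gi for gi in g)            # the product form of F(x)
coefficients = [a ** (m - k) for k in range(m + 1)]
rhs = sum(c * elementary_symmetric(g, k) for k, c in enumerate(coefficients))
print(lhs, rhs)
```

The brute-force `elementary_symmetric` is exponential in `m` and is meant only for tiny examples.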



^3 In the toy example we are currently studying, an alternative and simpler approach is to write
$F_i(x) = (1 - 2^{-w/2})^{1 - h_i(x)}$, where $h_i(x) = \bigvee_{j=1}^{w/2} x_{w(i-1)+j}$ is the indicator for whether $x$ already satisfies the $i$'th clause on its own.
Then $F(x) = \prod_i F_i(x)$ expands as a power series in $\sum_i (1 - h_i(x) - 2^{-w/2})$, and higher moment bounds can be used
to analyze what happens when we truncate this expansion. However, this expansion is rather specific to the highly
symmetric Tribes function, whereas we are able to apply the expansion in terms of symmetric polynomials much
more generally.
Under the uniform distribution, one can show that

$$\mathbb{E}_{x \sim \mathcal{U}}\big[\,|S_k(g_1(x), \ldots, g_m(x))|\,\big] \le \Big( \sum_{i=1}^m \mathbb{E}_x[g_i(x)^2] \Big)^{k/2}.$$

Since the total variance is about $2^{-w/2}$, the right-hand side is $2^{-\Omega(wk)}$. Thus for $k \ge O((\log n)/w)$,
we expect each term in the summation in Equation (1.4) to be
$1/\mathrm{poly}(n)$. So we can truncate at $d = O((\log n)/w)$ terms and retain a good approximation under
the uniform distribution.

Our analysis of the small-bias case is inspired by the gradually increasing independence paradigm
of Celis et al. [CRSW11], developed in the context of hashing. Every monomial in the $g_i$'s of degree
at most $d$ depends on at most $wd = O(\log n)$ variables. A small-bias space provides an almost
$O(\log n)$-wise independent distribution on the variables of $x$, so the $g_i(x)$'s will be almost $d$-wise
independent. This ensures that polynomials in $g_1(x), \ldots, g_m(x)$ of degree at most $d$ (such as
$S_1, \ldots, S_d$) will behave like they do under the uniform distribution. But we also need to argue that
the $S_k$'s for $k > d$ have a small contribution to $\mathbb{E}_{x \sim \mathcal{D}}[F(x)]$.

Towards this end, we prove the following inequality for any real numbers $z_1, \ldots, z_m$:

$$\text{If } |S_1(z_1, \ldots, z_m)| \le \frac{\mu}{2} \text{ and } |S_2(z_1, \ldots, z_m)| \le \frac{\mu^2}{2}, \text{ then } |S_k(z_1, \ldots, z_m)| \le \mu^k.$$

The proof uses the Newton-Girard formulas (see [CLO07]) which relate the symmetric
polynomials and power sums. This lets us repeat the same truncation argument, provided that
$S_1(g_1(x), \ldots, g_m(x))$ and $S_2(g_1(x), \ldots, g_m(x))$ are tightly concentrated even under small-bias
distributions. We prove this concentration holds via suitable higher moment inequalities.^4

This lets us show that small bias fools $F(x)$. By iterating this argument $\log w$ times, we get a
PRG for $f$ with polynomially small error and seed-length $O((\log n)(\log w)) = O((\log n)(\log\log n))$.

Read-Once CNFs. The case of general read-once CNFs presents several additional challenges.
Since we no longer know how the variables are grouped into clauses, we (pseudo)randomly choose
a subset of variables to assign using $\epsilon$-biased spaces, and argue that for most clauses, we will not
assign few variables. Clauses could now have very different sizes, and our approximation argument
relied on tuning the amount of independence (or where we truncate) to the width of the clause. We
handle this via an XOR lemma for $\epsilon$-biased spaces, which lets us break the formula into $O(\log\log n)$
formulae, each having clauses of nearly equal size, and argue about them separately.

Combinatorial Rectangles. A combinatorial rectangle $f : [W]^m \to \{0, 1\}$ is a function of the
form $f(x_1, \ldots, x_m) = \bigwedge_{i=1}^m f_i(x_i)$ for some Boolean functions $f_1, \ldots, f_m$. Thus, here we know which
parts of the input correspond to which clauses (like the toy example above), but our clauses are
arbitrary functions rather than ORs. To handle this, we use a more powerful family of gradual
restrictions. Rather than setting $w/2$ bits of each coordinate, we instead (pseudo)randomly
restrict the domain of each $x_i$ to a set of size $W^{1/2}$. More precisely, we use a small-bias space to
pseudorandomly choose hash functions $h_1, \ldots, h_m : [W^{1/2}] \to [W]$ and replace $f$ with the restricted
function $f'(z_1, \ldots, z_m) = \bigwedge_{i=1}^m (f_i \circ h_i)(z_i)$.

^4 These inequalities actually require higher moment bounds for the $g_i$'s. We ignore this issue in this description
for clarity, and because we suspect that this requirement should not be necessary.
Width 3 Branching Programs. For width 3 branching programs, inspired by Šíma and Žák [SZ11],
we reduce the task of constructing HSGs for width 3 to that of constructing HSGs for read-once CNF
formulas where we also allow some clauses to be parities. Our PRG construction for read-once CNFs
directly extends to also handle such formulas with parities (intuitively because small-bias spaces
treat parities just like individual variables). The first step of our reduction actually works for any
width $d$, and shows how to reduce the task of constructing HSGs for width $d$ to constructing
hitting set generators for width $d$ branching programs with sudden death, where the states in the
bottom level are all assumed to be Reject states.

Organization. Section 2 gives some preliminaries on pseudorandomness. Section 3 develops our 
main new technical tools for constructing sandwiching approximators for symmetric functions. We 
prove an XOR Lemma for $\epsilon$-biased spaces in Section 4.

Section 5 describes our PRG construction for combinatorial rectangles. The reduction from 
hitting sets for width 3 branching programs to hitting sets for CNFs with parity is in Section 6. 
The generators for read-once CNFs and for CNFs with parity are presented in Section 7 and Section 8,
respectively.

2 Preliminaries 

We briefly review some notation and definitions. We use $x \sim \mathcal{D}$ to denote sampling $x$ from a
distribution $\mathcal{D}$. For a set $S$, $x \sim S$ denotes sampling uniformly from $S$. By abuse of notation,
for a function $G : \{0, 1\}^r \to \{0, 1\}^n$ we let $G$ denote the distribution over $\{0, 1\}^n$ of $G(y)$ when
$y \sim \{0, 1\}^r$. For a function $f : \{0, 1\}^n \to \mathbb{R}$, we denote $\mathbb{E}[f] = \mathbb{E}_{x \sim \{0,1\}^n}[f(x)]$.

Hitting Set Generators and Pseudorandom Generators. 

Definition 2.1 (Hitting Set Generators). A generator $G : \{0, 1\}^r \to \{0, 1\}^n$ is an $(\epsilon, \delta)$-hitting set
generator (HSG) for a class $\mathcal{C}$ of Boolean functions if for every $f \in \mathcal{C}$ such that $\mathbb{E}[f] \ge \epsilon$, we have
$\mathbb{E}_{x \sim G}[f(x)] \ge \delta$. We refer to $r$ as the seed-length of the generator and say $G$ is explicit if there is
an efficient algorithm to compute $G$ that runs in time $\mathrm{poly}(n, 1/\delta)$.

Typically, our hitting set generators will be $(\epsilon, \delta)$ generators for some $\delta = \mathrm{poly}(\epsilon, 1/n)$. Given
two functions $g, h : \{0, 1\}^n \to \{0, 1\}$ we say $g \le h$ if $g(x) \le h(x)$ for all $x \in \{0, 1\}^n$. To prove that
$G$ hits $h$, it suffices to show $G$ hits some function $g \le h$.

Definition 2.2 (Pseudorandom Generators). A generator $G : \{0, 1\}^r \to \{0, 1\}^n$ is an $\epsilon$-pseudorandom
generator (PRG) for a class $\mathcal{C}$ of Boolean functions if for every $f \in \mathcal{C}$, $|\mathbb{E}[f] - \mathbb{E}_{y \sim G}[f(y)]| \le \epsilon$. We
refer to $r$ as the seed-length of the generator and say $G$ is explicit if there is an efficient algorithm
to compute $G$ that runs in time $\mathrm{poly}(n, 1/\epsilon)$. We say $G$ $\epsilon$-fools $\mathcal{C}$ and refer to $\epsilon$ as the error.
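To make the definition of the error concrete, the brute-force sketch below computes the largest gap $|\mathbb{E}[f] - \mathbb{E}_{y \sim G}[f(y)]|$ over a finite list of test functions. The generator `G` and the test classes are hypothetical examples chosen for illustration (they are not constructions from this paper): `G` stretches 2 seed bits to 3 output bits by appending their XOR.

```python
from itertools import product

def prg_error(G, r, n, tests):
    """Largest |E_uniform[f] - E_seed[f(G(seed))]| over the given test functions,
    computed by enumerating all 2^n inputs and all 2^r seeds."""
    seeds = list(product([0, 1], repeat=r))
    inputs = list(product([0, 1], repeat=n))
    worst = 0.0
    for f in tests:
        uniform_mean = sum(f(x) for x in inputs) / len(inputs)
        generator_mean = sum(f(G(y)) for y in seeds) / len(seeds)
        worst = max(worst, abs(uniform_mean - generator_mean))
    return worst

def G(y):
    """Hypothetical toy generator: 2 seed bits -> 3 output bits."""
    return (y[0], y[1], y[0] ^ y[1])

# Tests that look at two output bits at a time (AND of a pair of coordinates).
pair_tests = [lambda x, i=i, j=j: x[i] & x[j]
              for i in range(3) for j in range(i + 1, 3)]
# A test that looks at all three bits: accepts iff their XOR is 0.
parity_test = [lambda x: 1 - (x[0] ^ x[1] ^ x[2])]

pair_error = prg_error(G, 2, 3, pair_tests)
parity_error = prg_error(G, 2, 3, parity_test)
print(pair_error, parity_error)
```

The toy generator fools every pairwise test exactly but is far from fooling the three-bit parity, which illustrates that the error of a generator is always relative to the class $\mathcal{C}$.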

We shall make extensive use of small-bias spaces, introduced in the seminal work of Naor and 
Naor [NN93]. Usually these are defined as distributions over {0, 1}", but it is more convenient for 
us to work with {±1}". 

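Definition 2.3. A distribution $\mathcal{D}$ on $\{\pm 1\}^n$ is said to be $\epsilon$-biased if for every nonempty subset
$I \subseteq [n]$, $\left| \mathbb{E}_{x \sim \mathcal{D}}\left[ \prod_{i \in I} x_i \right] \right| \le \epsilon$.

There exist explicit constructions of $\epsilon$-biased spaces which can be sampled from with $O(\log n +
\log(1/\epsilon))$ random bits [NN93]. These give efficient pseudorandom generators for the class of parity
functions.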
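For intuition, the bias of a small sample space can be computed by brute force. The sketch below assumes the standard reading of the definition (the bias is the largest $|\mathbb{E}[\prod_{i \in I} x_i]|$ over nonempty $I \subseteq [n]$) and treats a list of points as the uniform distribution over that list; the two sample spaces are illustrative choices, not constructions from [NN93].

```python
from itertools import combinations, product
from math import prod

def bias(samples):
    """Largest |E[prod_{i in I} x_i]| over nonempty index sets I, for the uniform
    distribution on the given list of points in {+1, -1}^n (exponential time)."""
    n = len(samples[0])
    worst = 0.0
    for size in range(1, n + 1):
        for I in combinations(range(n), size):
            average = sum(prod(x[i] for i in I) for x in samples) / len(samples)
            worst = max(worst, abs(average))
    return worst

cube = list(product([1, -1], repeat=3))        # the full cube: uniform distribution
even = [x for x in cube if prod(x) == 1]       # points whose coordinates multiply to +1

cube_bias = bias(cube)
even_bias = bias(even)
print(cube_bias, even_bias)
```

The second space is pairwise independent, yet the parity of all three coordinates is constant on it, so its bias is as large as possible. Small-bias spaces are designed to rule out exactly this kind of failure for every parity at once.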
Definition 2.4. Let $0 < \alpha, \delta < 1/2$. We say a distribution $\mathcal{D}$ on $2^{[n]}$ is $\delta$-almost independent
with bias $\alpha$ if $I \sim \mathcal{D}$ satisfies the following conditions:

• For every $i \in [n]$, $\Pr[i \in I] = \alpha$.

• For any distinct indices $i_1, \ldots, i_k \in [n]$ and $b_1, \ldots, b_k \in \{0, 1\}$,

$$\Pr\Big[ \bigwedge_{j=1}^k \big( \mathbb{1}(i_j \in I) = b_j \big) \Big] = \prod_{j=1}^k \Pr\big[ \mathbb{1}(i_j \in I) = b_j \big] \pm \delta.$$

There exist explicit constructions of distributions $\mathcal{D}$ as above which only need $O(\log n +
\log(1/\alpha\delta))$ random bits [NN93]. We will write $I \sim \mathcal{D}(\alpha, \delta)$ for short whenever $I$ is sampled from a
$\delta$-almost independent distribution with bias $\alpha$ as above.



Sandwiching Approximators. One of the central tools we use is to construct sandwiching
polynomial approximations for various classes of functions. The approximating polynomials $(P_\ell, P_u)$
we construct for a function $f$ will have two properties: 1) low complexity as measured by the
"$L_1$-norm" of $P_\ell, P_u$ and 2) they "sandwich" $f$: $P_\ell \le f \le P_u$. The first property will be important to
argue that small-bias spaces fool the approximating polynomials and the second property will allow
us to lift this property to the function being approximated. We formalize these notions below. For
notational convenience, we shall view functions and polynomials as defined over $\{\pm 1\}^n$.

Definition 2.5. Let $P : \{\pm 1\}^n \to \mathbb{R}$ be a polynomial defined as $P(x) = \sum_{I \subseteq [n]} c_I \prod_{i \in I} x_i$. Then,
the $L_1$-norm of $P$ is defined by $L_1[P] = \sum_{I \subseteq [n]} |c_I|$. We say $f : \{\pm 1\}^n \to \mathbb{R}$ has $\delta$-sandwiching
approximations of $L_1$ norm $t$ if there exist functions $f_u, f_\ell : \{\pm 1\}^n \to \mathbb{R}$ such that

$$f_\ell(x) \le f(x) \le f_u(x) \ \ \forall x, \qquad \mathbb{E}[f_u(x)] - \mathbb{E}[f_\ell(x)] \le \delta, \qquad L_1(f_\ell), L_1(f_u) \le t.$$

We refer to $f_\ell$ and $f_u$ as the lower and upper sandwiching approximations to $f$ respectively.

It is easy to see that the existence of such approximations implies that $f$ is $(\delta + t\epsilon)$-fooled by
any $\epsilon$-biased distribution. In fact, as was implicit in the work of Bazzi [Baz09] and formalized in
the work of De et al. [DETT10], being fooled by small-bias spaces is essentially equivalent to the
existence of good sandwiching approximators.
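The $L_1$-norm of a small function can be computed directly from its expansion in monomials $\prod_{i \in I} x_i$. The sketch below does this by brute force for a single OR clause on three variables; the helper names are ours, and the encoding (1 for true, -1 for false, output in $\{0, 1\}$) follows the convention this paper uses for CNFs.

```python
from itertools import combinations, product
from math import prod

def monomial_coefficients(f, n):
    """Coefficients c_I of f(x) = sum_I c_I * prod_{i in I} x_i over {+1, -1}^n,
    obtained as c_I = E_x[f(x) * prod_{i in I} x_i] under the uniform distribution."""
    points = list(product([1, -1], repeat=n))
    coefficients = {}
    for size in range(n + 1):
        for I in combinations(range(n), size):
            total = sum(f(x) * prod(x[i] for i in I) for x in points)
            coefficients[I] = total / len(points)
    return coefficients

def l1_norm(f, n):
    """L_1-norm of f: the sum of |c_I| over all index sets I."""
    return sum(abs(c) for c in monomial_coefficients(f, n).values())

def clause_or(x):
    """A single OR clause: 1 if some variable is true (+1), else 0."""
    return 1 if any(v == 1 for v in x) else 0

or_norm = l1_norm(clause_or, 3)
print(or_norm)
```

A single clause has small $L_1$-norm, so it is fooled directly by small-bias spaces; the difficulty the paper addresses is that the norm can blow up when many clauses are multiplied together, which is why sandwiching approximators of controlled norm are needed.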

Lemma 2.6. [DETT10] Let $f : \{\pm 1\}^n \to \mathbb{R}$ be a function. Then, the following hold for every
$0 < \epsilon < \delta$:

• If $f$ has $\delta$-sandwiching approximations of $L_1$-norm at most $\delta/\epsilon$, then for every $\epsilon$-biased
distribution $\mathcal{D}$ on $\{\pm 1\}^n$, $|\mathbb{E}_{x \sim \mathcal{D}}[f(x)] - \mathbb{E}[f]| \le \delta$.

• If for every $\epsilon$-biased distribution $\mathcal{D}$, $|\mathbb{E}_{x \sim \mathcal{D}}[f(x)] - \mathbb{E}[f]| \le \delta$, then $f$ has $(2\delta)$-sandwiching
approximations of $L_1$-norm at most $|\mathbb{E}[f]| + \delta + (\delta/\epsilon)$.^5

^5 De et al. actually show a bound of $\delta/\epsilon$ on the $L_1$ norm of the sandwiching approximators excluding their constant
term. But it is easy to see that the constant term of the approximators is bounded by $|\mathbb{E}[f]| + \delta$.
Pseudorandom Generators for CNFs. A conjunctive normal form formula (CNF) is a
conjunction of disjunctions of literals. Throughout we view CNFs as functions on $\{\pm 1\}^n$, where we
identify $-1$ with false and $1$ with true. We say a CNF $f = C_1 \wedge C_2 \wedge \cdots \wedge C_m$ is a read-once CNF
(RCNF), if no variable appears (by itself or as its negation) more than once. We call $m$ the size
of $f$ and the maximum number of variables in $C_1, \ldots, C_m$ the width of $f$. We shall also use the
following results of [DETT10], [KLW10] which say that RCNFs with a small number of clauses have
very good sandwiching approximators.

Theorem 2.7. Let $f : \{\pm 1\}^n \to \{0, 1\}$ be an RCNF with at most $m$ clauses. Then, for every $\epsilon > 0$,
$f$ has $\epsilon$-sandwiching polynomials with $L_1$-norm at most $m^{O(\log(1/\epsilon))}$.

Theorem 2.8. Let $f : \{\pm 1\}^n \to \{0, 1\}$ be a CNF with at most $m$ clauses and width at most $w$.
Then, for every e > 0, f has e-sandwiching polynomials with Li-norm at most {m/e)^^^^"^^\ 



3 Sandwiching Approximators for Symmetric Functions 

For $k \ge 1$, let $S_k : \mathbb{R}^m \to \mathbb{R}$ denote the $k$-th elementary symmetric polynomial defined by

$$S_k(z_1, \ldots, z_m) = \sum_{I \subseteq [m],\, |I| = k} \ \prod_{i \in I} z_i.$$

Our main result on sandwiching approximators for symmetric functions is the following: 

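Theorem 3.1. Let $g_1, \ldots, g_m : \{\pm 1\}^n \to \mathbb{R}$ be functions on disjoint sets of input variables and
$\sigma_1, \sigma_2, \ldots, \sigma_m$ be positive numbers such that for all $i \in [m]$,

$$\mathbb{E}[g_i] = 0, \qquad L_1[g_i] \le t, \qquad \mathbb{E}_{x \sim \{\pm 1\}^n}\big[ g_i(x)^{2k} \big] \le (2k)^{2k} \sigma_i^{2k} \ \text{ for } k \ge 1.$$

Let $\sigma^2 = (\sum_i \sigma_i^2)/m$ and $\delta \in (0, 1)$ and $\epsilon, k > 0$ be such that

$$m\sigma^2 \le \frac{1}{\log(1/\delta)^{25}}, \qquad k = \left\lceil \frac{5 \log(1/\delta)}{\log(1/m\sigma^2)} \right\rceil, \qquad \epsilon = \frac{\delta^4}{(mt + 1)^{2k}}. \qquad (3.1)$$

Let $P(x) = \sum_{i=0}^m c_i S_i(g_1(x), \ldots, g_m(x))$ be a symmetric multilinear function of the $g_i$'s that computes
a bounded function $P : \{\pm 1\}^n \to [-B, B]$, with $|c_i| \le C$ for all $i \in [m]$. Then,

1. For every $\epsilon$-biased distribution $\mathcal{D}$, we have

$$\left| \mathbb{E}_{x \sim \{\pm 1\}^n}[P(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P(x)] \right| \le O(B + C)\delta.$$

2. $P$ has $O(B + C)\delta$ sandwiching approximations of $L_1$ norm $O((B + C)(mt + 1)^{2k} \delta^{-3})$.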

As an illustration of this theorem, we state the following immediate corollary which formalizes 
the argument for the toy example in the introduction. 

Theorem 3.2. Let $\kappa > 0$ be a constant. Let $g_1, \ldots, g_m : \{\pm 1\}^n \to \mathbb{R}$ be functions on disjoint
sets of input variables with $\mathbb{E}[g_i] = 0$, $L_1[g_i] = O(1)$ and $\sigma \le 1/m^{1/2 + \kappa}$. Let $P : \{\pm 1\}^n \to [-1, 1]$
be a symmetric polynomial in the $g_i$'s of the form $P(x) = \sum_{i=0}^m c_i S_i(g_1, \ldots, g_m)$, with $|c_i| \le 1$. Then,
for every $\delta \in (0, 1)$ with $\log(1/\sigma) \ge \Omega_\kappa(\log\log(1/\delta))$, $P$ has $\delta$-sandwiching polynomials of $L_1$-norm at
most $\mathrm{poly}(1/\delta)$.
To derive Theorem 3.2 from Theorem 3.1, observe that in the notation from Section 1.3,
$m = 2^{w+1}$, $\sigma^2 \approx 2^{-3w/2}$ and all the other conditions hold.

In the rest of this section, we prove the first statement of Theorem 3.1. The second statement 
follows from the first by Lemma 2.6. We first sketch the steps involved in the proof. 

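Let $k, \epsilon$ be as in the theorem and let $\mathcal{D}$ be an $\epsilon$-biased distribution. Let $P_{\le k} = \sum_{i=0}^k c_i S_i(g_1, \ldots, g_m)$.
We will prove the theorem by showing that $P$ cannot distinguish the uniform distribution from $\mathcal{D}$
by a series of inequalities:

$$\mathbb{E}_{x \sim \{\pm 1\}^n}[P(g_1(x), \ldots, g_m(x))] \approx_\delta \mathbb{E}_{x \sim \{\pm 1\}^n}[P_{\le k}(g_1(x), \ldots, g_m(x))] \approx_\delta \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(g_1(x), \ldots, g_m(x))] \approx_\delta \mathbb{E}_{x \sim \mathcal{D}}[P(g_1(x), \ldots, g_m(x))].$$

Of these, the second inequality will follow from the fact that $L_1[P_{\le k}] = \mathrm{poly}(1/\delta)$ (this is not
too hard). The first inequality can be seen as a special case of the last inequality as the uniform
distribution is an $\epsilon$-biased distribution for any $\epsilon$. Much of our effort will be in showing the last
inequality.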

To do this, we first show that there is an event $\mathcal{E}$ that happens with high probability under
any $\epsilon$-biased distribution, and conditioned on which $P_{\le k}$ is a very good approximation for $P$. We
then prove the last inequality by conditioning on the event $\mathcal{E}$ and using Cauchy-Schwarz to bound
the error when $\mathcal{E}$ does not occur. The event $\mathcal{E}$ will correspond to $|S_1(g_1, \ldots, g_m)|$, $|S_2(g_1, \ldots, g_m)|$
being small, which we show happens with high probability using classical moment bounds. Finally,
we show that $P_{\le k}$ approximates $P$ well if $\mathcal{E}$ happens by using the Newton-Girard identities for
symmetric polynomials (see Lemma 3.6).



3.1 Proof of Theorem 3.1 



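Our first task will be to show that under the assumptions of the theorem, $|\sum_i g_i(x)|$ and $|\sum_i g_i(x)^2|$
are small with high probability. We do so by first bounding the $k$'th moments of these variables
and applying Markov's inequality. For this we will use Rosenthal's inequalities ([Ros72], [JSZ85],
[Pin94]) which state the following:

Lemma 3.3. For independent random variables $Z_1, \ldots, Z_m$ such that $\mathbb{E}[Z_i] = 0$, and all $k \in \mathbb{N}$,

$$\mathbb{E}\Big[ \Big( \sum_{i=1}^m Z_i \Big)^{2k} \Big] \le (2k)^{2k} \max\Big( \sum_{i=1}^m \mathbb{E}[Z_i^{2k}],\ \Big( \sum_{i=1}^m \mathbb{E}[Z_i^2] \Big)^{k} \Big). \qquad (3.2)$$

For independent non-negative random variables $Z_1, \ldots, Z_m$, and all $k \in \mathbb{N}$,

$$\mathbb{E}\Big[ \Big( \sum_{i=1}^m Z_i \Big)^{k} \Big] \le k^{k} \max\Big( \sum_{i=1}^m \mathbb{E}[Z_i^{k}],\ \Big( \sum_{i=1}^m \mathbb{E}[Z_i] \Big)^{k} \Big). \qquad (3.3)$$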

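Lemma 3.4. For all integers $k \ge 2$,

$$\mathbb{E}_{x \sim \{\pm 1\}^n}\Big[ \Big( \sum_{i=1}^m g_i(x) \Big)^{2k} \Big] \le (2k)^{4k} \Big( \sum_{i=1}^m \sigma_i^2 \Big)^{k}, \qquad (3.4)$$

$$\mathbb{E}_{x \sim \{\pm 1\}^n}\Big[ \Big( \sum_{i=1}^m g_i(x)^2 \Big)^{k} \Big] \le (2k)^{4k} \Big( \sum_{i=1}^m \sigma_i^2 \Big)^{k}. \qquad (3.5)$$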
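Proof. Let $Z_i = g_i(x)$, $x \sim \{\pm 1\}^n$. Then the $Z_i$'s are independent mean-zero variables. Now, by
Rosenthal's inequality, Equation (3.2), and the moment assumptions of Theorem 3.1,

$$\mathbb{E}\Big[ \Big( \sum_{i=1}^m g_i \Big)^{2k} \Big] \le (2k)^{2k} \max\Big( \sum_{i=1}^m \mathbb{E}[g_i^{2k}],\ \Big( \sum_{i=1}^m \mathbb{E}[g_i^2] \Big)^{k} \Big) \le (2k)^{4k} \Big( \sum_{i=1}^m \sigma_i^2 \Big)^{k}.$$

The second bound follows similarly by applying Rosenthal's inequality, Equation (3.3), to the
non-negative random variables $Z_i = g_i^2$:

$$\mathbb{E}\Big[ \Big( \sum_{i=1}^m g_i^2 \Big)^{k} \Big] \le k^{k} \max\Big( \sum_{i=1}^m \mathbb{E}[g_i^{2k}],\ \Big( \sum_{i=1}^m \mathbb{E}[g_i^2] \Big)^{k} \Big) \le (2k)^{4k} \Big( \sum_{i=1}^m \sigma_i^2 \Big)^{k}. \qquad \Box$$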

□ 



A consequence of Lemma 3.4 is the following: 
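Corollary 3.5. For all $k \ge 2$, under any $\epsilon$-biased distribution $\mathcal{D}$,

$$\mathbb{E}_{x \sim \mathcal{D}}\Big[ \Big( \sum_{i=1}^m g_i(x) \Big)^{2k} \Big] \le (2k)^{4k} (m\sigma^2)^{k} + \epsilon (mt)^{2k}, \qquad (3.6)$$

$$\mathbb{E}_{x \sim \mathcal{D}}\Big[ \Big( \sum_{i=1}^m g_i(x)^2 \Big)^{k} \Big] \le (2k)^{4k} (m\sigma^2)^{k} + \epsilon (mt^2)^{k}. \qquad (3.7)$$

Proof. Note that for any function $h : \{\pm 1\}^n \to \mathbb{R}$, $L_1[h^k] \le (L_1[h])^k$. Therefore, applying this
inequality to $h = \sum_i g_i$, we get $L_1[(\sum_i g_i)^{2k}] \le (mt)^{2k}$. The first inequality now follows from
Lemma 3.4 and Lemma 2.6. The second inequality follows similarly. □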

Next we show that $|\sum_i g_i|$, $\sum_i g_i^2$ being small implies the smallness in absolute value of
$S_k(g_1, \ldots, g_m)$ for every $k \ge 2$. Note that there is no probability involved in this statement.

Lemma 3.6. Let $z_1, \ldots, z_m$ be real numbers that satisfy

$$\Big| \sum_{i=1}^m z_i \Big| \le \mu, \qquad \sum_{i=1}^m z_i^2 \le \mu^2.$$

Then for every $k \ge 2$ we have

$$|S_k(z_1, \ldots, z_m)| \le \mu^k.$$
Proof. To prove this lemma, we first bound the power sums $E_k(z_1, \ldots, z_m)$ which are defined as

$$E_k(z_1, \ldots, z_m) = \sum_{i=1}^m z_i^k.$$

Note that $E_1 = S_1$. We start by bounding $E_k$ for $k \ge 2$ using the norm inequalities

$$\Big( \sum_{i=1}^m |z_i|^k \Big)^{1/k} \le \Big( \sum_{i=1}^m z_i^2 \Big)^{1/2} \le \mu.$$

Hence we have $|E_k(z_1, \ldots, z_m)| \le \mu^k$.

The relation between the power sums and elementary symmetric polynomials is given by the
Newton-Girard identities (see [CLO07], Chapter 7.1 for instance) discovered in the 17th century.

$$S_k(z_1, \ldots, z_m) = \frac{1}{k} \sum_{i=1}^k (-1)^{i-1} S_{k-i}(z_1, \ldots, z_m) E_i(z_1, \ldots, z_m). \qquad (3.8)$$

We use these to show by induction on $k$ that $|S_k| \le \mu^k$. For $k = 2$, we have
$S_2(z_1, \ldots, z_m) = \frac{1}{2}\big( S_1(z_1, \ldots, z_m)^2 - E_2(z_1, \ldots, z_m) \big)$, so

$$|S_2(z_1, \ldots, z_m)| \le \frac{1}{2}(\mu^2 + \mu^2) \le \mu^2.$$

Assume we have proved the bound up to $k - 1$. Using the Newton-Girard formula,

$$|S_k(z_1, \ldots, z_m)| \le \frac{1}{k} \sum_{i=1}^k |S_{k-i}(z_1, \ldots, z_m)| \, |E_i(z_1, \ldots, z_m)| \le \frac{1}{k} \sum_{i=1}^k \mu^{k-i} \mu^{i} \le \mu^k.$$

□
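The two ingredients of this argument, the Newton-Girard identity (3.8) and the resulting bound on $|S_k|$, can be checked numerically on a random instance. The sketch below assumes the hypotheses are $|\sum_i z_i| \le \mu$ and $\sum_i z_i^2 \le \mu^2$ (our reading of the lemma), picks the smallest $\mu$ satisfying both for a random vector, and uses brute-force symmetric polynomials; all names and the instance size are illustrative choices.

```python
from itertools import combinations
from math import prod
import random

def elementary_symmetric(z, k):
    """S_k(z): sum over all k-subsets of the product of their entries (S_0 = 1)."""
    return sum(prod(subset) for subset in combinations(z, k))

def power_sum(z, k):
    """E_k(z): sum of k-th powers."""
    return sum(v ** k for v in z)

random.seed(1)
z = [random.uniform(-1, 1) for _ in range(8)]
# Smallest mu with |sum z_i| <= mu and sum z_i^2 <= mu^2 (assumed hypotheses).
mu = max(abs(power_sum(z, 1)), power_sum(z, 2) ** 0.5)

max_identity_error = 0.0
bound_holds = True
for k in range(1, len(z) + 1):
    s_k = elementary_symmetric(z, k)
    # Right-hand side of the Newton-Girard identity (3.8).
    newton = sum((-1) ** (i - 1) * elementary_symmetric(z, k - i) * power_sum(z, i)
                 for i in range(1, k + 1)) / k
    max_identity_error = max(max_identity_error, abs(newton - s_k))
    bound_holds = bound_holds and abs(s_k) <= mu ** k * (1 + 1e-9)
print(max_identity_error, bound_holds)
```

A single random instance is of course no substitute for the proof; the point is only to make the identity and the shape of the bound concrete.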



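Let

$$P_{\le k}(x) = \sum_{i=0}^k c_i S_i(g_1, \ldots, g_m)$$

denote the truncation of $P$ to degree $k$. We use the following bounds for $P_{\le k}$.

Lemma 3.7. Let $P, m, t, C, \mathcal{D}$ be as in Theorem 3.1. If $m\sigma^2 \le \frac{1}{2}$, then for every $k \in \mathbb{N}$,

$$\mathbb{E}_{x \sim \mathcal{D}}\big[ P_{\le k}(x)^2 \big] \le 2C^2 + \epsilon \cdot (mt + 1)^{2k} \cdot C^2. \qquad (3.9)$$

Proof. We observe that the symmetric polynomials $S_0 = 1, \ldots, S_k$ on $g_1, \ldots, g_m$ are mutually
orthogonal under the uniform distribution, i.e., for $i \ne j$,

$$\mathbb{E}_{x \sim \{\pm 1\}^n}\big[ S_i(g_1(x), \ldots, g_m(x)) \cdot S_j(g_1(x), \ldots, g_m(x)) \big] = 0.$$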
For brevity, we shall omit writing out the argument x in the following. For i > 1, we have 



ma ) . 
Therefore, assuming that $m\sigma^2 \le 1/2$,



x~{±l}" — a;~{±l}" — 

1=0 1=0 



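Since $L_1[g_j] \le t$, we have

\[ L_1[S_i(g_1, \ldots, g_m)] \le \binom{m}{i} t^i. \]

Hence

\[ L_1[P_{\le k}] \le C \sum_{i=0}^{k} \binom{m}{i} t^i \le C \cdot (mt + 1)^k \quad \text{and} \quad L_1[P_{\le k}^2] \le L_1[P_{\le k}]^2 \le C^2 \cdot (mt + 1)^{2k}. \]

Hence, by Lemma 2.6,

\[ \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)^2] \le C^2\big(2 + \epsilon (mt + 1)^{2k}\big) \le 2C^2 + \epsilon \cdot (mt + 1)^{2k} \cdot C^2. \qquad (3.10) \]

□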



Setting Parameters. In Theorem 3.1, we choose 

51og(l/<^) 



k 



log(l/mcr^] 



which guarantees 6^/2 < {ma'^)^ < S^. By Equation (3.1) we have 



ma < 



log(l/(5)25^ 



from which it follows that 



, < J^MML^, and 



{2k)^'' < 7- 



51oglog(l/5) 
1 



(3.11) 
(3.12) 



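Finally, for all $\epsilon$ small enough so that $\epsilon \cdot (mt + 1)^{2k} \le \delta^4$, the following bounds will hold under the assumptions of Theorem 3.1, by Corollary 3.5 and Lemma 3.7:

\[ \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)^2] \le 4C^2, \qquad (3.13) \]

\[ \mathbb{E}_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)\Big)^{2k}\Big] \le (2k)^{2k}(m\sigma^2)^{k} + \epsilon (mt)^{2k} \le 2\delta^4, \qquad (3.14) \]

\[ \mathbb{E}_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)^2\Big)^{k}\Big] \le (2k)^{2k}(m\sigma^2)^{k} + \epsilon (mt^2)^{k} \le 2\delta^4. \qquad (3.15) \]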

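We now proceed to prove Statement (1) in Theorem 3.1, which we restate below with specific constants.

Lemma 3.8. With the notation from Theorem 3.1, we have

\[ \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P(x)] \Big| \le (4B + 13C) \cdot \delta. \qquad (3.16) \]

Proof. We will show that under any $\epsilon$-biased distribution $\mathcal{D}$,

\[ \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)|] \le (2B + 6C)\delta. \qquad (3.17) \]

Note that the uniform distribution is $\epsilon$-biased for $\epsilon = 0$, so the above bound applies to it. We derive Equation (3.16) from Equation (3.17) as follows:

\[ \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P(x)] \Big| \le \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P(x)] - \mathbb{E}_{x \sim \{\pm 1\}^n}[P_{\le k}(x)] \Big| + \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P_{\le k}(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)] \Big| + \Big| \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P(x)] \Big|. \qquad (3.18) \]

The first and last terms are bounded using Equation (3.17). We bound the middle term by

\[ \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P_{\le k}(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)] \Big| \le \epsilon \cdot L_1[P_{\le k}] \le \epsilon \cdot C \cdot (mt + 1)^k \le C\delta^4. \]

Equation (3.16) follows by plugging these bounds into Equation (3.18):

\[ \Big| \mathbb{E}_{x \sim \{\pm 1\}^n}[P(x)] - \mathbb{E}_{x \sim \mathcal{D}}[P(x)] \Big| \le 2(2B + 6C)\delta + C\delta^4 \le (4B + 13C)\delta. \]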

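We now prove Equation (3.17). Define a good event $G \subseteq \{\pm 1\}^n$ containing those $x$ for which the following bounds hold:

\[ \Big|\sum_{i=1}^{m} g_i(x)\Big| \le \delta^{1/k}, \qquad \sum_{i=1}^{m} g_i(x)^2 \le \delta^{2/k}. \qquad (3.19) \]

For $x \in G$, $P_{\le k}(x)$ gives a good approximation to $P(x)$. By Lemma 3.6, we have $|S_i(g_1(x), \ldots, g_m(x))| \le \delta^{i/k}$ for all $i \ge 2$. Hence, for all $x \in G$,

\[ |P(x) - P_{\le k}(x)| \le \sum_{i=k+1}^{m} |c_i S_i(g_1(x), \ldots, g_m(x))| \le C \sum_{i=k+1}^{m} \delta^{i/k} \le C\delta \sum_{i \ge 1} \delta^{i/k} \le 2C\delta. \qquad (3.20) \]

We now bound the probability of $\neg G$ using Markov's inequality applied to a $k$'th moment bound obtained from Equations (3.14) and (3.15):

\[ \Pr_{x \sim \mathcal{D}}\Big[\Big|\sum_{i=1}^{m} g_i(x)\Big| \ge \delta^{1/k}\Big] = \Pr_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)\Big)^{2k} \ge \delta^2\Big] \le \frac{1}{\delta^2} \mathbb{E}_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)\Big)^{2k}\Big] \le 2\delta^2, \]

\[ \Pr_{x \sim \mathcal{D}}\Big[\sum_{i=1}^{m} g_i(x)^2 \ge \delta^{2/k}\Big] = \Pr_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)^2\Big)^{k} \ge \delta^2\Big] \le \frac{1}{\delta^2} \mathbb{E}_{x \sim \mathcal{D}}\Big[\Big(\sum_{i=1}^{m} g_i(x)^2\Big)^{k}\Big] \le 2\delta^2. \]

Let $1_G(x)$ and $1_{\neg G}(x)$ denote the indicators of $G$ and $\neg G$ respectively. We have

\[ \mathbb{E}_{x \sim \mathcal{D}}[1_{\neg G}(x)] \le \Pr_{x \sim \mathcal{D}}\Big[\Big|\sum_{i=1}^{m} g_i(x)\Big| \ge \delta^{1/k}\Big] + \Pr_{x \sim \mathcal{D}}\Big[\sum_{i=1}^{m} g_i(x)^2 \ge \delta^{2/k}\Big] \le 4\delta^2. \qquad (3.21) \]

Further,

\[ \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)|] = \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)| \cdot 1_G(x)] + \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)| \cdot 1_{\neg G}(x)]. \qquad (3.22) \]

By Equation (3.20), we have

\[ \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)| \cdot 1_G(x)] \le \max_{x \in G} |P(x) - P_{\le k}(x)| \le 2C\delta. \qquad (3.23) \]

To bound the second term, we apply the Cauchy-Schwarz inequality:

\[ \mathbb{E}_{x \sim \mathcal{D}}[|P(x) - P_{\le k}(x)| \cdot 1_{\neg G}] \le \mathbb{E}_{x \sim \mathcal{D}}[|P(x)| \cdot 1_{\neg G}] + \mathbb{E}_{x \sim \mathcal{D}}[|P_{\le k}(x)| \cdot 1_{\neg G}] \]
\[ \le \mathbb{E}_{x \sim \mathcal{D}}[P(x)^2]^{1/2}\, \mathbb{E}_{x \sim \mathcal{D}}[1_{\neg G}]^{1/2} + \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)^2]^{1/2}\, \mathbb{E}_{x \sim \mathcal{D}}[1_{\neg G}]^{1/2} \le B \cdot 2\delta + 2C \cdot 2\delta, \qquad (3.24) \]

where we use the bounds

\[ \mathbb{E}_{x \sim \mathcal{D}}[P(x)^2] \le B^2 \quad (\text{since } |P(x)| \le B), \]
\[ \mathbb{E}_{x \sim \mathcal{D}}[P_{\le k}(x)^2] \le 4C^2 \quad (\text{Equation (3.13)}), \]
\[ \mathbb{E}_{x \sim \mathcal{D}}[1_{\neg G}] \le 4\delta^2 \quad (\text{Equation (3.21)}). \]

Plugging Equations (3.23) and (3.24) into Equation (3.22) we get Equation (3.17). □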



4 An XOR Lemma for e-biased spaces 

In this section, we prove an XOR Lemma that helps us show the existence of good sandwiching approximators for the composition of a function on few variables with functions on disjoint sets of variables, each of which has good sandwiching approximators. We call it an XOR lemma, since one can view it as a generalization of Vazirani's XOR lemma.

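Theorem 4.1. Let $f^1, \ldots, f^k : \{\pm 1\}^n \to [0, 1]$ be functions on disjoint input variables such that each $f^i$ has $\epsilon$-sandwiching approximations of $L_1$ norm $t$. Let $H : [0, 1]^k \to [0, 1]$ be a multilinear function in its inputs. Let $h : \{\pm 1\}^n \to [0, 1]$ be defined as $h(x) = H(f^1(x), \ldots, f^k(x))$. Then $h$ has $(16^k \epsilon)$-sandwiching approximations of $L_1$ norm $4^k (t + 1)^k$.

Proof. For $S \subseteq [k]$ define the monomial

\[ M^S(x) = \prod_{i \in S} f^i(x) \prod_{j \notin S} (1 - f^j(x)). \]

Let $f^i_u$ and $f^i_\ell$ denote the upper and lower sandwiching approximations to $f^i$. Then we have

\[ f^i_u(x) \ge f^i(x), \qquad \mathbb{E}_{x \sim \{\pm 1\}^n}[f^i_u(x) - f^i(x)] \le \epsilon, \]
\[ 1 - f^i_\ell(x) \ge 1 - f^i(x), \qquad \mathbb{E}_{x \sim \{\pm 1\}^n}[(1 - f^i_\ell(x)) - (1 - f^i(x))] \le \epsilon. \]

Hence, if we define

\[ M^S_u(x) = \prod_{i \in S} f^i_u(x) \prod_{j \notin S} (1 - f^j_\ell(x)), \]

then we have

\[ M^S_u(x) \ge M^S(x) \quad \forall x \in \{\pm 1\}^n, \qquad L_1[M^S_u] = \prod_{i \in S} L_1[f^i_u] \prod_{j \notin S} L_1[1 - f^j_\ell] \le (t + 1)^k. \]

We will show, using a hybrid argument, that

\[ \mathbb{E}_{x \sim \{\pm 1\}^n}[M^S_u(x) - M^S(x)] \le 2^k \epsilon. \]

For simplicity, we only do the case $S = [k]$. We define a sequence of polynomials $M^S_u = M_0, M_1, \ldots, M_k = M^S$ where

\[ M_i(x) = \prod_{j=1}^{i} f^j(x) \prod_{j=i+1}^{k} f^j_u(x). \]

We now have

\[ \mathbb{E}_{x \sim \{\pm 1\}^n}[M_i(x) - M_{i+1}(x)] = \mathbb{E}_{x \sim \{\pm 1\}^n}\Big[\prod_{j=1}^{i} f^j(x) \cdot \big(f^{i+1}_u(x) - f^{i+1}(x)\big) \cdot \prod_{j=i+2}^{k} f^j_u(x)\Big] \]
\[ = \prod_{j=1}^{i} \mathbb{E}[f^j(x)] \cdot \mathbb{E}\big[f^{i+1}_u(x) - f^{i+1}(x)\big] \cdot \prod_{j=i+2}^{k} \mathbb{E}[f^j_u(x)] \le \epsilon (1 + \epsilon)^{k-i-1}, \]

where we use the facts that $\mathbb{E}_{x \sim \{\pm 1\}^n}[f^j] \le 1$ and $\mathbb{E}_{x \sim \{\pm 1\}^n}[f^j_u] \le \mathbb{E}_{x \sim \{\pm 1\}^n}[f^j] + \epsilon \le 1 + \epsilon$. We now have

\[ \mathbb{E}_{x \sim \{\pm 1\}^n}[M^S_u(x) - M^S(x)] \le \sum_{i=0}^{k-1} \mathbb{E}_{x \sim \{\pm 1\}^n}[M_i(x) - M_{i+1}(x)] \le \epsilon\big(1 + (1 + \epsilon) + \cdots + (1 + \epsilon)^{k-1}\big) \le 2^k \epsilon. \]

To construct a lower-sandwiching approximator, we observe that

\[ \sum_{S \subseteq [k]} M^S(x) = \prod_{i \in [k]} \big(f^i(x) + 1 - f^i(x)\big) = 1. \]

Hence if we define

\[ M^S_\ell(x) = 1 - \sum_{T \ne S} M^T_u(x), \]

then

\[ M^S_\ell(x) \le 1 - \sum_{T \ne S} M^T(x) = M^S(x), \]
\[ \mathbb{E}_{x \sim \{\pm 1\}^n}[M^S(x) - M^S_\ell(x)] = \sum_{T \ne S} \mathbb{E}_{x \sim \{\pm 1\}^n}[M^T_u(x) - M^T(x)] \le 4^k \epsilon, \]
\[ L_1[M^S_\ell] \le 2^k (t + 1)^k. \]

Finally, let $1_S \in \{0, 1\}^k$ denote the indicator vector of the set $S$. Since $H$ is multilinear, we can write

\[ H(y) = \sum_{S \subseteq [k]} H(1_S) \prod_{i \in S} y_i \prod_{j \notin S} (1 - y_j), \]

where $H(1_S) \in [0, 1]$. Hence

\[ h(x) = \sum_{S \subseteq [k]} H(1_S) \prod_{i \in S} f^i(x) \prod_{j \notin S} (1 - f^j(x)) = \sum_{S \subseteq [k]} H(1_S) M^S(x). \]

We define the polynomials

\[ h_u(x) = \sum_{S \subseteq [k]} H(1_S) M^S_u(x), \qquad h_\ell(x) = \sum_{S \subseteq [k]} H(1_S) M^S_\ell(x). \]

It follows that

\[ h_u(x) \ge h(x) \ge h_\ell(x), \]
\[ \mathbb{E}_{x \sim \{\pm 1\}^n}[h_u(x) - h(x)] \le 16^k \epsilon, \qquad \mathbb{E}_{x \sim \{\pm 1\}^n}[h(x) - h_\ell(x)] \le 16^k \epsilon, \]
\[ L_1[h_u] \le 2^k (t + 1)^k, \qquad L_1[h_\ell] \le 4^k (t + 1)^k. \]

□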
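The two algebraic facts used in the proof, that the monomials $M^S$ sum to 1 and that a multilinear $H$ is determined by its values $H(1_S)$ on the corners of the cube, can be checked numerically. The sketch below is illustrative only: the function names are ours, and the corner values are drawn at random rather than taken from any particular $H$.

```python
from itertools import product
import random

def monomial(y, S):
    """M^S(y) = prod_{i in S} y_i * prod_{j not in S} (1 - y_j)."""
    out = 1.0
    for i, yi in enumerate(y):
        out *= yi if i in S else (1.0 - yi)
    return out

def subsets(k):
    """All subsets of {0, ..., k-1}, as frozensets."""
    for bits in product([0, 1], repeat=k):
        yield frozenset(i for i, b in enumerate(bits) if b)

def multilinear_extension(corner_values, y):
    """H(y) = sum_S H(1_S) * M^S(y); corner_values maps S to H(1_S)."""
    return sum(corner_values[S] * monomial(y, S) for S in corner_values)

random.seed(1)
k = 4
corner_values = {S: random.random() for S in subsets(k)}   # H(1_S) in [0, 1]
y = [random.random() for _ in range(k)]                    # a point of [0, 1]^k
total_mass = sum(monomial(y, S) for S in subsets(k))
value = multilinear_extension(corner_values, y)
```

Since the $M^S(y)$ are non-negative and sum to 1, `value` is a convex combination of the corner values, which is why $h$ stays in $[0, 1]$.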



5 A PRG for Combinatorial Rectangles 

We start by defining combinatorial rectangles (CRs). 

Definition 5.1. A combinatorial rectangle is a function $f : (\{\pm 1\}^w)^m \to \{0, 1\}$ of the form $f(x_1, \ldots, x_m) = \bigwedge_{i=1}^{m} f_i(x_i)$, where $f_i : \{\pm 1\}^w \to \{0, 1\}$ and each $x_i \in \{\pm 1\}^w$. We refer to the $f_i$'s as the co-ordinate functions of $f$. We refer to $m$ as the size$^1$ of $f$ and to $w$ as the width.

We construct an explicit PRG for CRs with seed-length $\tilde{O}(\log m + w + \log(1/\delta))$. The previous best construction, due to Lu, had a seed-length of $O(\log m + w + \log^{3/2}(1/\delta))$ [Lu02].

Theorem 5.2. There is an explicit pseudorandom generator for the class of combinatorial rectangles of width $w$ and size $m$ with error at most $\delta$ and seed-length $O((\log w)(\log(m) + w + \log(1/\delta)) + \log(1/\delta) \log\log(1/\delta) \log\log\log(1/\delta))$.

$^1$ This is usually referred to as the dimension in the literature; we use this terminology for the CNF analogy.

Our generator uses a recursive sampling technique, and we next describe a single step of this recursive procedure. For this informal description suppose that $\delta = 1/\mathrm{poly}(m)$, $w = O(\log m)$ and let $v = 3w/4$. Fix a CR $f : (\{\pm 1\}^w)^m \to \{0, 1\}$.

Consider the following two-step process for generating a uniformly random element $x$ from $(\{\pm 1\}^w)^m$.

• Choose a sequence of multi-sets $S_1, \ldots, S_m \subseteq \{\pm 1\}^w$, each of size $2^v$, by picking $2^v$ elements of $\{\pm 1\}^w$ independently and uniformly at random.

• Sample $x_i \sim S_i$ and set $x = (x_1, \ldots, x_m)$.

This results in an $x$ that is uniformly distributed over $(\{\pm 1\}^w)^m$. We will show that $\mathbb{E}_x[f(x)]$ does not change much even if the sampling in the first step is done pseudorandomly using a small-bias space for suitably small $\epsilon$.

Our final generator is obtained by iterating the one-step procedure for $T = O(\log\log m)$ steps: at step $t$ we choose multi-sets $S^t_1 \subseteq S^{t-1}_1, \ldots, S^t_m \subseteq S^{t-1}_m$, each of cardinality exactly $2^{(3/4)^t w}$, using small-bias. After $T$ steps, we are left with a rectangle of width $O(\log\log m)$. Such rectangles can be fooled by $\epsilon$-bias spaces where $\epsilon = 1/m^{O(\log\log m)}$. The total randomness used over all the steps is $O((\log m) \cdot (\log\log m))$.

5.1 Sandwiching Approximations for Bias Functions 

In the following, let $f$ be a CR of width $w$ with coordinate functions $f_1, \ldots, f_m : \{\pm 1\}^w \to \{0, 1\}$. We describe a restriction of $f$ which reduces the width from $w$ to $v = 3w/4$.

• For every $a \in \{\pm 1\}^v$, we sample a string $x_a = (x_{a,1}, \ldots, x_{a,m}) \in (\{\pm 1\}^w)^m$.

• For $i \in [m]$, we define restricted co-ordinate functions $f^v_i$ on inputs $y_i \in \{\pm 1\}^v$ by $f^v_i(y_i) = f_i(x_{y_i, i})$.

• Define the restricted rectangle $f^v : (\{\pm 1\}^v)^m \to \{0, 1\}$ on $y_1, \ldots, y_m$ by

\[ f^v(y_1, \ldots, y_m) = \bigwedge_{i=1}^{m} f^v_i(y_i). \qquad (5.1) \]

Let $x \in (\{\pm 1\}^w)^{2^v \times m}$ denote the matrix whose rows are indexed by $a \in \{\pm 1\}^v$, whose columns are indexed by $i \in [m]$, and whose $(a, i)$'th entry is $x[a, i] = x_{a,i} \in \{\pm 1\}^w$. Every such matrix defines a restriction of $f$. We will show that choosing $x$ from an $\epsilon$-biased space, for $\epsilon = 1/\mathrm{poly}(m)$ suitably small, and choosing $x$ from the uniform distribution have almost the same effect on $f$. For $i \in [m]$, let $x[i]$ denote the $i$'th column of $x$. For each coordinate function $f_i$, define the sample average function

\[ \bar{f}_i(x[i]) = \frac{1}{2^v} \sum_{a \in \{\pm 1\}^v} f_i(x_{a,i}) = \mathbb{E}_{a \sim \{\pm 1\}^v}[f_i(x_{a,i})]. \qquad (5.2) \]

Note that each $\bar{f}_i$ only depends on column $i$ of $x$. Define the bias function of $x$ as

\[ F(x) = \prod_{i=1}^{m} \bar{f}_i(x[i]) = \prod_{i=1}^{m} \mathbb{E}_{y_i \sim \{\pm 1\}^v}[f^v_i(y_i)]. \qquad (5.3) \]

The main lemma of this section shows that this bias function can be fooled by small-bias spaces.

Lemma 5.3 (Main). Let $F$ be as defined in Equation (5.3). Assume that $\delta < 1/4$ and $w \le \log(1/\delta)$, $v = 3w/4 \ge 50 \log\log(1/\delta)$. Then $F(x)$ has $\delta$-sandwiching approximations of $L_1$ norm $\mathrm{poly}(1/\delta)$.

We start by stating two simple claims.

Claim 5.4. For the sample average functions $\bar{f}_i$ defined as in Equation (5.2), we have

\[ L_1(\bar{f}_i) \le L_1(f_i) \le 2^{w/2}, \qquad \mathbb{E}_{x}[\bar{f}_i(x[i])] = \mathbb{E}_{z \sim \{\pm 1\}^w}[f_i(z)]. \]

Proof. From Equation (5.2), it follows that

\[ L_1[\bar{f}_i] \le \frac{1}{2^v} \sum_{a \in \{\pm 1\}^v} L_1[f_i] = L_1[f_i] \le 2^{w/2}, \]

where the last inequality holds for any Boolean function on $w$ input bits. The bound on the expectation follows directly from Equation (5.2). □

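The justification for the name bias function comes from the following claim.

Claim 5.5. For $f^v$ and $F$ as defined in Equation (5.1) and Equation (5.3),

\[ \mathbb{E}_{y \sim (\{\pm 1\}^v)^m}[f^v(y)] = F(x). \]

Proof. Note that

\[ f^v_i(y_i) = \sum_{a \in \{\pm 1\}^v} 1[y_i = a]\, f_i(x_{a,i}), \quad \text{hence} \quad \mathbb{E}_{y_i \sim \{\pm 1\}^v}[f^v_i(y_i)] = \frac{1}{2^v} \sum_{a \in \{\pm 1\}^v} f_i(x_{a,i}) = \bar{f}_i(x[i]). \]

It follows that

\[ \mathbb{E}_{y \sim (\{\pm 1\}^v)^m}[f^v(y)] = \mathbb{E}_{y}\Big[\prod_{i=1}^{m} f^v_i(y_i)\Big] = \prod_{i=1}^{m} \mathbb{E}_{y_i \sim \{\pm 1\}^v}[f^v_i(y_i)] = \prod_{i=1}^{m} \bar{f}_i(x[i]) = F(x). \]

□

We will prove Lemma 5.3 by applying Theorem 3.1 to the functions $g_i : (\{\pm 1\}^w)^{2^v} \to \mathbb{R}$ defined as follows: $g_i(x) = (\bar{f}_i(x) - p_i)/p_i$, where $p_i = \mathbb{E}_{x}[\bar{f}_i(x)]$. (We assume $p_i \ne 0$.)

We will need the following technical lemma, which helps us show that the functions $g_i$ satisfy the moment conditions needed to apply Theorem 3.1. For brevity, let $\mathcal{U}$ denote $(\{\pm 1\}^w)^{2^v}$ in the remainder of this section.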
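The identity of Claim 5.5 can be verified by brute force on a toy instance. In the sketch below, strings in $\{\pm 1\}^w$ and $\{\pm 1\}^v$ are encoded as integers, each coordinate function is represented by its set of accepted inputs, and the matrix $x$ is drawn uniformly; these encodings, names and sizes are assumptions made for illustration.

```python
from itertools import product
import random

def restricted_rectangle(fs, x, y):
    """f^v(y) = AND_i f_i(x[y_i][i]); fs[i] is the set of inputs accepted by f_i."""
    return int(all(x[y[i]][i] in fs[i] for i in range(len(fs))))

def bias_function(fs, x):
    """F(x) = prod_i (average over rows a of f_i(x[a][i]))."""
    rows = len(x)
    out = 1.0
    for i, accepted in enumerate(fs):
        out *= sum(x[a][i] in accepted for a in range(rows)) / rows
    return out

random.seed(2)
w, v, m = 4, 3, 3                      # toy sizes with v = 3w/4
fs = [set(random.sample(range(2 ** w), 9)) for _ in range(m)]
# x has 2^v rows and m columns; every entry is a w-bit string (as an integer).
x = [[random.randrange(2 ** w) for _ in range(m)] for _ in range(2 ** v)]
# Average of the restricted rectangle over all y in ({+-1}^v)^m.
avg = sum(restricted_rectangle(fs, x, y)
          for y in product(range(2 ** v), repeat=m)) / (2 ** v) ** m
```

The enumeration over all $y$ costs $2^{vm}$ evaluations, so this only works for toy parameters; the point of the bias function is precisely that $F(x)$ gives the same number as a product of $m$ column averages.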
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Lemma 5.6. Letpi,gi,'U be defined as above. We have Ej.^t/[5j(x)^'^] < {2k)'^^af^ where 

r(i=^ /or €[2-^/10,1/2], 
I toil forp,e[l/2^-2-% 
I /or K G [l-2-^l]. 

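Proof. We start by bounding the moments of $(\bar{f}_i(x) - p_i)$. We have

\[ 2^v(\bar{f}_i(x) - p_i) = \sum_{a \in \{\pm 1\}^v} \big(f_i(x_{a,i}) - p_i\big), \]

which is the sum of $2^v$ i.i.d. $p_i$-biased random variables with mean 0. Hence we can apply Rosenthal's inequality (Equation (3.2)) to get

\[ (2^v)^{2k}\, \mathbb{E}_{x \sim \mathcal{U}}\big[(\bar{f}_i(x) - p_i)^{2k}\big] \le (2k)^{2k} \max\Big(2^v\big((1 - p_i)^{2k} p_i + p_i^{2k}(1 - p_i)\big),\ \big(2^v p_i (1 - p_i)\big)^k\Big). \]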

Hence we have 

x^U p. x^U 



(2^ 



<^max(2M(^) P. + 1-P.),(2 



We will use the following bounds 

r (('-^Tp. + 1-p] < l^^""'^" ' ^/^]- 

VV Pi J - [2^+1(1 forp,G [1/2,1]. 

(^2^'.i-^) <{2^+\l-p,))' forp,e [1/2,1]. 

From this it follows that ^x'^uidii^)'^^] < {2kY^af^ where 

rliiM for KG [2-/10,1/2], 
al=hSkpl for KG [1/2, 1-2--], 
I2I7 forpi G [l-2-^l]. 

□ 

Proof of Lemma 5.3. We first show the claim under the assumption that $\mathbb{E}[f] = p \ge \delta$ and later show how to get around this assumption. Define the sets

\[ S_1 = \{i : p_i \in (0, 2^{-v/10}]\}, \quad S_2 = \{i : p_i \in (2^{-v/10}, 1 - 2^{-v}]\}, \quad S_3 = \{i : p_i \in (1 - 2^{-v}, 1]\}. \]

For $j \in [3]$, let $F_j(x) = \prod_{i \in S_j} \bar{f}_i(x)$, so $F(x) = \prod_{j=1}^{3} F_j(x)$. We will construct sandwiching approximations for each $F_j$ and then combine them via Theorem 4.1. We assume without loss of generality that $p_i \le 1 - 2^{-w}$. Else, the $i$'th coordinate has bias 1 and can be ignored without changing the rest of the proof.

Sandwiching $F_1$. We show that $L_1[F_1]$ is itself small. Observe that $\delta \le p = \prod_{i=1}^{m} p_i \le \prod_{i \in S_1} p_i \le 2^{-v|S_1|/10}$, which implies that $|S_1| \le 10 \log(1/\delta)/v$. Thus, by Claim 5.4,

\[ L_1[F_1] \le \prod_{i \in S_1} L_1[\bar{f}_i] \le \big(2^{w/2}\big)^{10 \log(1/\delta)/v} = \Big(\frac{1}{\delta}\Big)^{5w/v}, \]

which is $\mathrm{poly}(1/\delta)$ since $w/v = 4/3$.

Sandwiching $F_2$. Note that $F_2(x) = \prod_{i \in S_2} \bar{f}_i(x) = \prod_{i \in S_2} p_i (1 + g_i(x))$. Notice that $F_2$ is a symmetric polynomial in the $g_i$'s, so we will obtain sandwiching polynomials for $F_2$ by applying Theorem 3.1 to the $g_i$'s. As before, $\delta \le p \le \prod_{i \in S_2} p_i \le (1 - 2^{-v})^{|S_2|}$, so we have $|S_2| \le 2^v \log(1/\delta)$. Further, we can write $\delta \le p \le \prod_{i \in S_2} (1 - (1 - p_i)) \le e^{-\sum_{i \in S_2} (1 - p_i)}$, so that $\sum_{i \in S_2} (1 - p_i) \le 2 \log(1/\delta)$.
By Lemma 5.6, we have E^^{±^u[gi{xf''] < {2kY^af^, where 2/2^' < = {1- pi) /2^''/^^ for every 
i G 82- Hence, 



I -Pi < 21og(lA) ^ 1 

29^10 - 29^10 - log(l/5)25 ■ 



ieS2 ieS2 



Hence Theorem 3.1 implies the existence of 0{5) {B = 1 and C = HieSa ^* — sandwiching 
approximations with Li norm bounded by {mt + l)"^^ where 

m = |52|<2^1og(l/5)<2W4, t < 2^2 < 2", , < Slog^ ^ 251o^^ 

which implies the Li norm is bounded by poly (1/(5). 
Sandwiching F3. We write 



Note that each i e S3 satisfies 1 - Pi > 2""', which implies that l^s] < 2"'+Mog(l/(5)). Let 
af = 2/2^^ > ^. Then, by Lemma 5.6, Es^u[gi{x)''] < {2kf^af and we have 



2-+ilog(l/5)) ^ ^ ^ 1 



22v - 23^5 - log(lM)25- 

By Theorem 3.1, F3 has 0(5) sandwiching approximations with Li norm bounded by (rat + l)^'^ 
where 

m < 2'^log(l/p) < 2^^/^ 
t < 2""/"^ < 2^', 

51og(l/J) ^ 251og(l/(5) 
-log(l/E.^n" 3^ ■ 
which implies the Li norm is bounded by poly (1/5). 

Sandwiching $F$. Since each $F_j$ has $O(\delta)$-sandwiching approximations with $L_1$ norm $\mathrm{poly}(1/\delta)$, by Theorem 4.1, $F = F_1 F_2 F_3$ has $O(\delta)$-sandwiching approximations of $L_1$ norm $\mathrm{poly}(1/\delta)$.

Handling all values of $\mathbb{E}[f]$. Finally, to get rid of the condition $\mathbb{E}[f] \ge \delta$, assume that $\mathbb{E}[f] < \delta$. If $\mathbb{E}[f] = 0$, then $f \equiv 0$, so there is nothing to prove. If $\mathbb{E}[f] > 0$, every co-ordinate $f_i$ has at least one satisfying assignment. We repeat the following procedure until the expectation is at least $\delta$: pick a co-ordinate $i$ which is not already the constant 1 function and add a new satisfying assignment to $f_i$. Such a co-ordinate $i$ exists because $\delta < 1$. We repeat this until we get a rectangle $f^t$ such that $\mathbb{E}[f^t] \ge \delta$. Denote the resulting sequence

\[ f = f^0 \le f^1 \le \cdots \le f^t. \]

We claim that for every $j$,

\[ \mathbb{E}_x[f^j(x)] \le \mathbb{E}_x[f^{j+1}(x)] \le 2\, \mathbb{E}_x[f^j(x)]. \]

The last inequality holds since at each step we at most double the acceptance probability of the chosen co-ordinate, and hence of the overall formula. Hence we have

\[ \mathbb{E}[f^t] \le 2\, \mathbb{E}[f^{t-1}] < 2\delta. \]

We use the upper approximator for $f^t$ as the upper approximator for $f$ and $0$ as the lower approximator. This gives sandwiching approximators with error at most $2\delta$ and $L_1$ norm $\mathrm{poly}(1/\delta)$.

This completes the proof of the lemma. □
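The padding procedure from the last step of the proof can be sketched as follows. We track only the number of satisfying assignments of each co-ordinate, which is enough to compute $\mathbb{E}[f]$ for a rectangle; the function names and the toy parameters are ours.

```python
from fractions import Fraction

def acceptance(sizes, w):
    """E[f] for a rectangle whose i-th coordinate accepts sizes[i] of the 2^w inputs."""
    p = Fraction(1)
    for s in sizes:
        p *= Fraction(s, 2 ** w)
    return p

def pad_rectangle(sizes, w, delta):
    """Add one satisfying assignment at a time until E[f] >= delta; return the trace."""
    sizes = list(sizes)
    assert all(s >= 1 for s in sizes), "E[f] = 0 is handled separately"
    trace = [acceptance(sizes, w)]
    while trace[-1] < delta:
        # Pick a coordinate that is not yet the constant-1 function.
        i = next(j for j, s in enumerate(sizes) if s < 2 ** w)
        sizes[i] += 1
        trace.append(acceptance(sizes, w))
    return sizes, trace

delta = Fraction(1, 16)
sizes, trace = pad_rectangle([1, 2, 1, 3], w=3, delta=delta)
# Each step at most doubles the acceptance probability, so the final
# rectangle has expectation in [delta, 2 * delta).
assert all(b <= 2 * a for a, b in zip(trace, trace[1:]))
assert delta <= trace[-1] < 2 * delta
```

Exact rational arithmetic is used so that the doubling bound can be asserted without rounding concerns.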

5.2 A Recursive Sampler for Combinatorial Rectangles 

We now use Lemma 5.3 recursively to prove Theorem 5.2. Our generator is based on a derandomized recursive sampling procedure which we describe below. The inputs are the width $w$ and the size $m$ of the rectangles we wish to fool and an error parameter $\delta \le 1/2^w$.

1. Let $v_0 = w$, $v_j = (3/4)^j w$.

2. While $v_j \ge 50 \log\log(1/\delta)$ we sample $x_j \in ((\{\pm 1\}^{v_{j-1}})^{2^{v_j}})^m$ according to an $\epsilon_1$-biased distribution, where $\epsilon_1 \le (1/\delta)^{-c_1}$ for some large constant $c_1$.

3. Assume that at step $t$ (where $t = O(\log w)$) we have $v_t < 50 \log\log(1/\delta)$. Sample an input $x_t \in (\{\pm 1\}^{v_{t-1}})^m$ from an $\epsilon_2$-biased distribution where, for some large constant $c_2$,

\[ \epsilon_2 \le (1/\delta)^{-c_2 (\log\log(1/\delta) \log\log\log(1/\delta))}. \]

We next describe how we use $x = (x_1, \ldots, x_t)$ to output an element of $(\{\pm 1\}^w)^m$. For $k \in \{0, \ldots, t - 1\}$ we denote by $s_k$ the recursive sampling function which takes strings $x_j \in (\{\pm 1\}^{v_{j-1}})^{2^{v_j} \times m}$ for $j \in \{k + 1, \ldots, t - 1\}$ and $x_t \in (\{\pm 1\}^{v_{t-1}})^m$ and produces an output string $s_k(x_{k+1}, \ldots, x_t) \in (\{\pm 1\}^{v_k})^m$. Set $s_{t-1}(x_t) = x_t$. Fix $k < t - 1$ and let $z = s_{k+1}(x_{k+2}, \ldots, x_t)$ be already defined. To define $s_k$, we will use $z$ to look up entries from the matrix $x_{k+1}$, so that the $i$'th coordinate of $s_k$ will be the entry of $x_{k+1}$ in the $z_i$'th row and $i$'th column:

\[ s_k(x) = s_k(x_{k+1}, \ldots, x_t) = \big((x_{k+1})_{z_1, 1}, (x_{k+1})_{z_2, 2}, \ldots, (x_{k+1})_{z_m, m}\big) \in (\{\pm 1\}^{v_k})^m. \]
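The lookup structure of the sampler can be sketched in a few lines. The sketch below is illustrative only: strings in $\{\pm 1\}^{v_j}$ are encoded as integers in $[0, 2^{v_j})$, the matrices are filled with truly random entries rather than drawn from small-bias spaces, and the widths are toy values.

```python
import random

def recursive_sample(matrices, x_last):
    """Evaluate s_0 by repeated table lookups.

    matrices[k] plays the role of x_{k+1}: a list of 2^{v_{k+1}} rows with m
    entries each, every entry being a v_k-bit string encoded as an integer.
    x_last plays the role of x_t and is the starting point s_{t-1}(x_t) = x_t.
    """
    z = list(x_last)
    for matrix in reversed(matrices):
        # The i-th coordinate of s_k is the entry in row z_i, column i.
        z = [matrix[z[i]][i] for i in range(len(z))]
    return z

random.seed(3)
m, widths = 5, [4, 3, 2]               # toy values for v_0, v_1, v_2
matrices = [[[random.randrange(2 ** widths[j]) for _ in range(m)]
             for _ in range(2 ** widths[j + 1])]
            for j in range(len(widths) - 1)]
x_last = [random.randrange(2 ** widths[-1]) for _ in range(m)]
output = recursive_sample(matrices, x_last)   # an element of ({+-1}^{v_0})^m
```

Each output coordinate is obtained by following a chain of lookups that stays within column $i$, which is what lets the analysis treat the co-ordinate functions separately.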

The above definition, though intuitive, is a bit cumbersome to work with. It will be far easier for analysis to fix the input combinatorial rectangle $f : (\{\pm 1\}^w)^m \to \{0, 1\}$ and study the effect of the samplers on $f$. Let $f^0 = f$. Each matrix $x_j$ gives a restriction of $f^{j-1}$: it defines restricted co-ordinate functions $f^j_i : \{\pm 1\}^{v_j} \to \{0, 1\}$ and a corresponding restricted rectangle $f^j : (\{\pm 1\}^{v_j})^m \to \{0, 1\}$. We only use the following property of the $s_j$'s:

\[ f(s_0(x)) = f^1(s_1(x)) = \cdots = f^{t-1}(s_{t-1}(x)). \qquad (5.4) \]

To analyze the last step, we use the following corollary that follows from [DETTIO]. 
Corollary 5.7. Every combinatorial rectangle f : {{±1}''}™ — > {±1} is 6-fooled by e-bias spaces 

Proof. Each co-ordinate function $f_i$ can be expressed as a CNF formula with $2^v$ clauses of width $v$. Hence we can write $f$ as a CNF formula with $m 2^v$ clauses of width $v$. Now apply Theorem 2.8. □

For brevity, in the following let 

be the domain of x as defined in the generator construction. 

Let $\mathcal{D}^j$ denote the distribution on $\mathcal{U}$ where the $x_i$ are sampled from an $\epsilon$-biased distribution for $i \le j$ and uniformly for $i > j$. Then $s_0(\mathcal{D}^0)$ is the uniform distribution on $(\{\pm 1\}^w)^m$, whereas $s_0(\mathcal{D}^t)$ is the output of our Recursive Sampler.

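Lemma 5.8. Let $f : (\{\pm 1\}^w)^m \to \{0, 1\}$ be a combinatorial rectangle with width $w$ and size $m$. For the distributions $\mathcal{D}^0$ and $\mathcal{D}^t$ defined above, we have

\[ \Big| \mathbb{E}_{x \sim \mathcal{D}^0}[f(s_0(x))] - \mathbb{E}_{x \sim \mathcal{D}^t}[f(s_0(x))] \Big| \le \delta. \]

Proof. Let $\delta' = \delta/t$. We will show by a hybrid argument that for all $j \in \{1, \ldots, t\}$,

\[ \Big| \mathbb{E}_{x \sim \mathcal{D}^{j-1}}[f(s_0(x))] - \mathbb{E}_{x \sim \mathcal{D}^{j}}[f(s_0(x))] \Big| \le \delta'. \qquad (5.5) \]

In both $\mathcal{D}^{j-1}$ and $\mathcal{D}^j$, $x_i$ is drawn from an $\epsilon$-biased distribution for $i < j$, and from the uniform distribution for $i > j$. The only difference is $x_j$, which is sampled uniformly in $\mathcal{D}^{j-1}$ and from an $\epsilon$-biased distribution in $\mathcal{D}^j$.

We couple the two distributions by drawing $x_i$ for $i < j$ according to an $\epsilon$-biased distribution. By Equation (5.4), we get

\[ \mathbb{E}_{x \sim \mathcal{D}^{j-1}}[f(s_0(x))] = \mathbb{E}_{x \sim \mathcal{D}^{j-1}}[f^{j-1}(s_{j-1}(x))], \qquad \mathbb{E}_{x \sim \mathcal{D}^{j}}[f(s_0(x))] = \mathbb{E}_{x \sim \mathcal{D}^{j}}[f^{j-1}(s_{j-1}(x))], \]

and our goal is now to show that

\[ \Big| \mathbb{E}_{x \sim \mathcal{D}^{j-1}}[f^{j-1}(s_{j-1}(x))] - \mathbb{E}_{x \sim \mathcal{D}^{j}}[f^{j-1}(s_{j-1}(x))] \Big| \le \delta'. \qquad (5.6) \]

Define the bias function $F^{j-1}$ of the rectangle $f^{j-1}$ as in Equation (5.3). The string $x_j$ defines a restricted rectangle $f^j : (\{\pm 1\}^{v_j})^m \to \{0, 1\}$. Applying Claim 5.5 we get

\[ \mathbb{E}_{z \sim (\{\pm 1\}^{v_j})^m}[f^j(z)] = F^{j-1}(x_j). \]

In both distributions $\mathcal{D}^{j-1}$ and $\mathcal{D}^j$, $x_{j+1}, \ldots, x_t$ are distributed uniformly at random, hence $s_j(\mathcal{D}^{j-1}) = s_j(\mathcal{D}^j) \sim (\{\pm 1\}^{v_j})^m$ are uniformly distributed, and this variable is independent of $x_j$. So we have

\[ \mathbb{E}_{x \sim \mathcal{D}^{j-1}}[f^{j-1}(s_{j-1}(x))] = \mathbb{E}_{x_j}\, \mathbb{E}_{(x_{j+1}, \ldots, x_t) \sim \mathcal{D}^{j-1}}[f^j(s_j(x_{j+1}, \ldots, x_t))] = \mathbb{E}_{x_j}[F^{j-1}(x_j)], \]

where $x_j$ is uniform, and the same identity holds under $\mathcal{D}^j$ with $x_j$ drawn from the $\epsilon$-biased distribution. Thus it suffices to show that

\[ \Big| \mathbb{E}_{x_j \text{ uniform}}[F^{j-1}(x_j)] - \mathbb{E}_{x_j\ \epsilon\text{-biased}}[F^{j-1}(x_j)] \Big| \le \delta'. \]

By Lemma 5.3, this holds true for $j \le t - 1$ provided that $\epsilon_1 \le 1/\mathrm{poly}(1/\delta')$.

For $j = t$, note that this is equivalent to showing that $\epsilon_2$-bias fools the rectangle $f^{t-1}$. By Corollary 5.7, $f^{t-1}$ is $\delta'$-fooled by $\epsilon_2$-biased spaces where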



£2 = 



m \ -0{vt logDi) 



0(loglog(l/5')Iogloglog(l/y)) 



□ 



Plugging these back into Equation (5.5), the error is bounded by $t \cdot \delta' \le \delta$.
To complete the proof of Theorem 5.2, we observe that the total seed-length is

\[ s = O\big((\log w)\log(m 2^w/\epsilon_1) + \log(m 2^w/\epsilon_2)\big) = O\big((\log w)(\log m + w + \log(1/\delta)) + \log(1/\delta) \log\log(1/\delta) \log\log\log(1/\delta)\big). \]

We next state an application of our PRG to hardness amplification in NP. Say that a Boolean 
function / : {0, 1}" {0, 1} is (e, s)-hard if any circuit of size s cannot compute / on more 
than a 1/2 — £ fraction of inputs. The hardness amplification problem then asks if we can use a 
mildly hard function in a black-box manner to construct a much harder function. Following the 
works of O'Donnell [O'D04] and Healy, Vadhan and Viola [HVV04], Lu, Tsai and Wu [LTW07] 
showed how to construct (2~^("^''^\ 2"^("^''^))-hard functions in NP from (l/poly(n), 2^("))-hard 
functions in NP. Their improvement comes from using the PRG for combinatorial rectangles of Lu [Lu02] to partly derandomize the constructions of Healy, Vadhan and Viola. Using our PRG for combinatorial rectangles (Theorem 5.2) instead of Lu's generator in the arguments of Lu, Tsai and Wu immediately leads to the following improved hardness amplification within NP.

Corollary 5.9. If there is a balanced function in NP that is $(1/\mathrm{poly}(n), 2^{\Omega(n)})$-hard, then there exists a function in NP that is $(1/2^{n/\mathrm{poly}(\log n)}, 2^{n/\mathrm{poly}(\log n)})$-hard.



6 HSGs for Read-Once Branching Programs 

In this section, we reduce the problem of constructing an HSG for width 3 branching programs to the problem of HSG construction for CNF formulas which are allowed to have parity functions as clauses. We start with some definitions.

A read-once branching program (ROBP) $B$ of width $d$ has a vertex set $V$ partitioned into $n + 1$ layers $V_0 \cup \cdots \cup V_n$ where

1. $V_0 = \{(0, 0)\}$.

2. $V_t = \{(t, 1), \ldots, (t, d)\}$ for $t \in \{1, \ldots, n - 1\}$.

3. $V_n = \{(n, 1), (n, d)\}$.

The vertex $(0, 0)$ is referred to as the Start state, while $(n, 1)$ and $(n, d)$ are referred to as Acc and Rej, respectively. Each vertex $v \in V_t$ has two out-edges labeled 0 and 1, which lead to vertices $N_0(v)$ and $N_1(v)$ respectively in $V_{t+1}$. We refer to the set of states $\{(t, 1)\}_t$ as the top level and $\{(t, d)\}_t$ as the bottom level.

A string $x \in \{0, 1\}^n$ defines a path in $V_0 \times \cdots \times V_n$ beginning at Start and following the edge labeled $x_{i+1}$ from $V_i$. Let $\mathrm{Path}(x) = \mathrm{Path}_0(x), \ldots, \mathrm{Path}_n(x)$ denote this sequence of states, i.e., $\mathrm{Path}_0(x) = (0, 0)$, and $\mathrm{Path}_{i+1}(x) = N_{x_{i+1}}(\mathrm{Path}_i(x))$. The string $x$ is accepted if $\mathrm{Path}_n(x) = \mathrm{Acc}$. Thus the branching program naturally computes a function $f : \{0, 1\}^n \to \{0, 1\}$. Let $\mathbb{E}[f] = \mathbb{E}_{x \sim \{0,1\}^n}[f(x)] = \Pr_x[f(x) = 1]$.
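To make the definition concrete, here is a small sketch of an ROBP as a layered transition table, together with the backward recursion for the acceptance probability $p(v)$ of each state (the average of $p$ over the two successors). The representation is an assumption made for illustration: states are 0-indexed, the last layer has $d$ states of which only state 0 is accepting, and the program is generated at random.

```python
from fractions import Fraction
from itertools import product
import random

def random_robp(n, d, rng):
    """edges[t][s] = (N_0, N_1): successors in layer t+1 of state s in layer t.
    Layer 0 contains only the Start state."""
    edges = [[(rng.randrange(d), rng.randrange(d))]]
    edges += [[(rng.randrange(d), rng.randrange(d)) for _ in range(d)]
              for _ in range(n - 1)]
    return edges

def accepts(edges, x):
    """Follow Path(x) from Start and report whether it ends in Acc (state 0)."""
    state = 0
    for t, bit in enumerate(x):
        state = edges[t][state][bit]
    return state == 0

def acceptance_probabilities(edges, d):
    """Backward recursion p(v) = (p(N_0(v)) + p(N_1(v))) / 2 with p(Acc) = 1."""
    p = [Fraction(1)] + [Fraction(0)] * (d - 1)
    layers = [p]
    for layer in reversed(edges):
        p = [(p[a] + p[b]) / 2 for a, b in layer]
        layers.append(p)
    return layers[::-1]          # layers[t][s] = p((t, s))

rng = random.Random(4)
n, d = 8, 3
edges = random_robp(n, d, rng)
p_start = acceptance_probabilities(edges, d)[0][0]
brute = Fraction(sum(accepts(edges, x) for x in product([0, 1], repeat=n)), 2 ** n)
assert p_start == brute          # E[f] = p(Start)
```

The recursion runs in time $O(nd)$, whereas the brute-force comparison enumerates all $2^n$ inputs and is only there as a check.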

Let BP$(d, n)$ denote the set of all $f : \{0, 1\}^n \to \{0, 1\}$ that can be computed by width $d$ ROBPs. Our hitting set generator for BP$(3, n)$ uses a reduction to the problem of hitting CNF formulas where clauses can be disjunctions of variables or parity functions.

Definition 6.1. Let CNF$^{\oplus}(n)$ denote the class of read-once formulas $f : \{0, 1\}^n \to \{0, 1\}$ of the form $f = \bigwedge_i T_i$, where each $T_i$ is either a disjunction of literals or a parity function of literals and the $T_i$'s are on disjoint variables.

Theorem 6.2. For every $f \in$ BP$(3, n)$ there is an integer $k$ and $g \in$ CNF$^{\oplus}(n - k)$ such that $0^k \circ g^{-1}(1) \subseteq f^{-1}(1)$ and $\mathbb{E}[g] \ge (\mathbb{E}[f]/n)^{O(1)}$.

Given this reduction, we get an HSG for BP$(3, n)$ by using the PRG for CNF$^{\oplus}$ that we construct in Theorem 8.2:

Theorem 6.3. For every e > 0, there exists an explicit (e, (e/n)*^'^^))-HSG G : {0,1}'' {0, 1}" 
for BP(3,n) with a seed-length o/ 0((log(n/e)) • (loglog(?i/e))'^). 

We remark that using similar techniques, we can also achieve a seed-length of 0((log n) (log(l/ e))) 
which is better than the above bound for large values of e. We defer the details of this to the full 
version. 

The reduction in Theorem 6.2 is carried out in three steps. 

• The first step (for the sake of HSGs) reduces arbitrary width 3 programs to "sudden death" 
width 3 programs, where the last state in every layer is a Rej state. (This step in fact works 
for all widths.) 

• The second step reduces "sudden death" width 3 programs to intersections of width 2 pro- 
grams. 

• The third step reduces intersections of width 2 programs to CNF® formulae. 
6.1 Reduction to Branching Programs with Sudden Death 

Definition 6.4. A width d BP with sudden death is a BP where the bottom level states are all 
Rej states. Formally this means NQ{{t,d)) = Ni{(t,d)) = {t + l,d) for all t = 1, ... ,n — 1. Let 
BP'^^j(d,n) denote the set of functions computable by such programs. 

We reduce the problem of constructing hitting sets for width $d$ BPs to the same problem for width $d$ BPs with sudden death.
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Theorem 6.5. For every f E BP(cl, n) there is an integer k and a g : {0, 1}" ^ — )• {0, 1}, g € 
BP'^^j(d, n) such that O'' o g~^{l) C /^^l) and E[g] > K[f]'^/2n. 

We first set up some notation. For a vertex $v \in V_t$, let $p(v)$ denote the probability of reaching Acc starting from $v$ over a uniformly random choice of $x_{t+1}, \ldots, x_n$. We call a state $v \in V$ such that $p(v) = 0$ a Rej state. We order the states in $V_t$ so that

\[ p((t, 1)) \ge p((t, 2)) \ge \cdots \ge p((t, d)). \]

By definition,

\[ p(v) = \tfrac{1}{2}\big(p(N_0(v)) + p(N_1(v))\big). \]

It follows that

\[ \mathbb{E}[f] = p((0, 0)) \le p((1, 1)) \le \cdots \le p((n, 1)) = 1, \]
\[ \mathbb{E}[f] = p((0, 0)) \ge p((1, d)) \ge p((2, d)) \ge \cdots \ge p((n, d)) = 0. \]

Observe that, if $v \in V_j$ is such that $p(v) \le \mu$, then $p((i, d)) \le \mu$ for all $i \ge j$.

Lemma 6.6. Let $B \in$ BP$(d, n)$. Let $R$ be a set of states such that $p(v) \le \mu$ for all $v \in R$ and let $j$ be the first layer such that $R \cap V_j \ne \emptyset$. Let $B'$ be obtained from $B$ by converting all states in $R$ into Rej states by redirecting the edges out of $v \in R \cap V_i$, $i \ge j$, to $(i + 1, d)$. Let $p'(v)$ denote the accepting probabilities of vertices in $B'$. Then for all $v \in V$, we have $p'(v) \ge p(v) - \mu$.

Proof. If $p(v) \le \mu$ the claim is trivial, so fix $v$ such that $p(v) > \mu$. Let $R(x)$ denote the event that we visit a vertex in $R$ if we follow $x$ from $v$ in $B$, and let $u(x)$ denote the first vertex in $R$ that is visited by this path. Let $\mathrm{Acc}(x)$ denote the event that $B$ accepts. We have

\[ \Pr_x[R(x) \wedge \mathrm{Acc}(x)] = \sum_{r \in R} \Pr_x[u(x) = r \wedge \mathrm{Acc}(x)] = \sum_{r \in R} \Pr_x[u(x) = r] \cdot \Pr_x[\mathrm{Acc}(x) \mid u(x) = r] \le \sum_{r \in R} \Pr_x[u(x) = r] \cdot \mu \le \mu, \]

where we use $\Pr_x[\mathrm{Acc}(x) \mid u(x) = r] = p(r) \le \mu$ for all $r \in R$. But then

\[ \Pr_x[\mathrm{Acc}(x) \wedge \neg R(x)] = \Pr_x[\mathrm{Acc}(x)] - \Pr_x[\mathrm{Acc}(x) \wedge R(x)] \ge p(v) - \mu. \]

Finally, note that if we accept $x$ without ever reaching $R$ in $B$, then $x$ is also accepted by $B'$. Hence $p'(v) \ge p(v) - \mu$. □

Proof of Theorem 6.5. Let -B be a branching program computing a function / so that E[/] > e. 
Let i denote the first layer where p{{i,d)) < e/2. Note that i <n since p{{n,d)) = 0. Every state 
V up to layer i — 1 satisfies p{v) > e/2. Further, for every j > i, p{[i + 1, d)) < e/2. Fix k = i — 2 
and let v be the state in level i — 1 reached from Start on the string 0^'. Consider the branching 
program B' of length n' = n — k where we make v the new start state and keep the rest of the 
program unchanged. The vertex set of B' is V = {v} ^^=1 it computes /' : {0, l}" — >• {0, 1} 

such that 

IE ^[f'{y)]=p{v)>e/2. 
Thus, a random walk starting at $v$ reaches the top level with probability at least $\epsilon/2$ (since this is a necessary condition for $B'$ to accept). For $j \in \{i, \ldots, n - 1\}$, let $q(j)$ denote the probability that we reach the top level for the first time at layer $j$. So

n+l 
j=i 

Hence there exists j so that q{j) > e/2n. 

We now make the following modifications to B' to get a program B" which is a width d program 
with sudden death: 

• For t S {i, . . . , j — 1} we convert the states {t, 1) into Rej states. 

• For t e {j, . . . , n + 1} we convert the states (t, d) into Rej states. 

We don't need to add an additional layer for making these modifications since we are turning one 
state in each layer to a Rej state. 

It is clear that $B''$ computes a function $f'' \le f'$. Our goal is to show that $B''$ accepts a large subset of the inputs accepted by $B'$. Indeed, we claim that

\[ \mathbb{E}_y[f''(y)] \ge \frac{\epsilon^2}{4n}. \]

We observe that the probability that a random walk starting at $v$ reaches the top level for the first time in layer $j$ is the same in $B''$ as in $B'$, hence it equals $q(j) \ge \epsilon/2n$. Further, applying Lemma 6.6 to the sub-program of $B'$ starting at $(j, 1)$, we claim that

\[ p''((j, 1)) \ge p'((j, 1)) - \epsilon/2 \ge \epsilon/2, \]

where we use the fact that $p'((j, 1)) = p((j, 1)) \ge p((1, 1)) \ge \epsilon$. Note that the probability that $B''$ accepts is at least $q(j)\, p''((j, 1)) \ge \epsilon^2/4n$, which comes from strings which reach state $(j, 1)$ and then reach Acc.

The theorem now follows by setting g = f". By definition, /" G BP'^^-'(d, n — k) and 

\[ 0^k \circ (f'')^{-1}(1) \subseteq 0^k \circ (f')^{-1}(1) \subseteq f^{-1}(1). \]

□ 



6.2 From BP^"j(3) to Intersections of BP(2) 

We now reduce width 3 programs with sudden death to intersections of width 2 programs. 

Theorem 6.7. Let f : {0,1}" {0,1} be in BP'^^j(3, n). Then, there exists a function g : 
$\{0, 1\}^n \to \{0, 1\}$ that is an intersection of functions in BP$(2, n)$ such that $g \le f$ and, if $p = \mathbb{E}[f]$, then $\mathbb{E}[g] \ge (p/2)^{13}$.

Throughout this section, we are given B £ BP'^^j(d, n) computing / : {0, 1}" ^ {0, 1}. Let Bad 
denote the set of non-reject states that have an out-edge leading to a Rej state (recall that the Rej states are exactly the states $v$ with $p(v) = 0$). Further, for each $x \in \{0, 1\}^n$, let Bad$(x)$ denote the number of Bad states visited by $x$.

Lemma 6.8. We have

\[ \Pr_{x \sim \{0,1\}^n}[\mathrm{Bad}(x) \ge t] \le 2^{-t+1}. \]

Proof. Suppose that $t \ge 1$. For $i \in [n]$, let $Y_i$ denote the number of vertices in Bad visited by $\mathrm{Path}(x)$ in the first $i$ layers. Then $Y_n = \mathrm{Bad}(x)$. We claim that

\[ \Pr[Y_i = Y_{i+1} = Y_{i+2} = \cdots = Y_n \mid Y_i, \mathrm{Path}_i(x) \in \mathrm{Bad}] \ge 1/2. \qquad (6.1) \]

This is because, if $\mathrm{Path}_i(x) \in \mathrm{Bad}$, then with probability at least $1/2$, $\mathrm{Path}_{i+1}(x)$ is a Rej state, in which case $\mathrm{Path}_j(x)$ is a Rej state for every $j \ge i + 1$.

Further, if $Y_n \ge t$, then there must be an index $i < n$ where $Y_i \ge t - 1$, $\mathrm{Path}_i(x) \in \mathrm{Bad}$ and $Y_n > Y_i$ (for instance $i$ can be the least $j$ such that $Y_j = t - 1$). Therefore,

\[ \Pr[Y_n \ge t] = \Pr[(\exists i < n,\ Y_i \ge t - 1,\ \mathrm{Path}_i(x) \in \mathrm{Bad}) \wedge (Y_n > Y_i)] \]
\[ = \Pr[(\exists i < n,\ Y_i \ge t - 1,\ \mathrm{Path}_i(x) \in \mathrm{Bad})] \cdot \Pr[Y_n > Y_i \mid Y_i, \mathrm{Path}_i(x) \in \mathrm{Bad}] \]
\[ \le \tfrac{1}{2} \cdot \Pr[(\exists i < n,\ Y_i \ge t - 1,\ \mathrm{Path}_i(x) \in \mathrm{Bad})] \le \tfrac{1}{2} \cdot \Pr[Y_n \ge t - 1], \]

where the last two inequalities follow from Equation (6.1) and the fact that the $Y_i$'s are non-decreasing. The claim now follows by induction. □
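The tail bound of Lemma 6.8 can be checked exhaustively on a small random program. The sketch below makes several simplifying assumptions that are ours: width is 3, states are 0-indexed with state 2 an absorbing reject state in every layer, every layer (including layer 0) has three states with the walk starting in state 0, and a state is treated as Rej exactly when Acc is unreachable from it.

```python
from fractions import Fraction
from itertools import product
import random

def sudden_death_program(n, rng):
    """layers[t][s] = (N_0, N_1); state 2 always leads to state 2 (sudden death)."""
    layers = []
    for _ in range(n):
        layer = [(rng.randrange(3), rng.randrange(3)) for _ in range(2)]
        layer.append((2, 2))
        layers.append(layer)
    return layers

def reach_accept(layers):
    """can[t][s] is True iff Acc (state 0 of the last layer) is reachable from (t, s)."""
    can = [[True, False, False]]
    for layer in reversed(layers):
        can.append([can[-1][a] or can[-1][b] for a, b in layer])
    return can[::-1]

def bad_count(layers, can, x):
    """Number of Bad states on Path(x): non-reject states with an out-edge into Rej."""
    state, count = 0, 0
    for t, bit in enumerate(x):
        a, b = layers[t][state]
        if can[t][state] and not (can[t + 1][a] and can[t + 1][b]):
            count += 1
        state = layers[t][state][bit]
    return count

def tail_probability(counts, t):
    """Pr over uniform x of Bad(x) >= t, as an exact fraction."""
    return Fraction(sum(c >= t for c in counts), len(counts))

rng = random.Random(5)
n = 10
layers = sudden_death_program(n, rng)
can = reach_accept(layers)
counts = [bad_count(layers, can, x) for x in product([0, 1], repeat=n)]
assert all(tail_probability(counts, t) <= Fraction(2) ** (1 - t)
           for t in range(1, n + 2))
```

The check mirrors the proof: every visit to a Bad state is followed by a Rej state with probability at least 1/2, and no Bad state can be visited after a Rej state.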

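Corollary 6.9. Let $\Pr_{x \sim \{0,1\}^n}[f(x) = 1] = p$. Then

\[ \mathbb{E}_{x \sim f^{-1}(1)}[\mathrm{Bad}(x)] = \mathbb{E}_x[\mathrm{Bad}(x) \mid f(x) = 1] \le 2 \log(2/p). \]

Proof. We have

\[ \Pr_x[\mathrm{Bad}(x) \ge t \mid f(x) = 1] = \frac{\Pr_x[(\mathrm{Bad}(x) \ge t) \text{ and } (f(x) = 1)]}{\Pr_x[f(x) = 1]} \le \frac{1}{2^{t-1} p}. \]

Let $t^* = \log(2/p)$. We then bound

\[ \mathbb{E}_x[\mathrm{Bad}(x) \mid f(x) = 1] = \sum_{t \ge 0} t \cdot \Pr_x[\mathrm{Bad}(x) = t \mid f(x) = 1] \le t^* + \sum_{t > t^*} t \cdot \Pr_x[\mathrm{Bad}(x) = t \mid f(x) = 1] \]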

^ ' X 

t>t* 

t>t* ^ 

= i* + l- {^^^ + ^) = 21og(l/p) + 2 < 21og(2/p). 



□ 



The rest of our argument is specific to $d = 3$. We restrict our attention to the accepting strings $x \in f^{-1}(1)$. For each vertex $v \in V$ let $q(v) = \Pr_{x \sim f^{-1}(1)}[v \in \mathrm{Path}(x)]$. Each layer $t$ has three states $(t, 1)$, $(t, 2)$ and $(t, 3) \in$ Rej. We assume that $q((t, 1)) \ge q((t, 2)) \ge q((t, 3)) = 0$ (since accepting strings never visit a Rej state). We first bound the probability mass on states in the set Bad.



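Lemma 6.10. We have

\[ \sum_{v \in \mathrm{Bad}} q(v) = \mathbb{E}_{x \sim f^{-1}(1)}[\mathrm{Bad}(x)]. \]

Proof. We have

\[ \sum_{v \in \mathrm{Bad}} q(v) = \sum_{v \in \mathrm{Bad}} \Pr_{x \sim f^{-1}(1)}[x \text{ visits } v] = \mathbb{E}_{x \sim f^{-1}(1)}[\mathrm{Bad}(x)], \]

by linearity of expectation. □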
We partition the set Bad based on the value of q{v): 

Bad' = G Bad : qiv) < || . Bad' = j?; € Bad : q{v) > ^ 

By Lemma 6.10 and Corollary 6.9 it follows that |Bad^| < 81og(2/p). 



Lemma 6.11. We have 



Pr [Path(x) n Bad" = 0] > {p/2)^. 



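Proof. Since $q((t,1)) \ge 1/2$ for all $t$, while every vertex of $\mathrm{Bad}^*$ has $q(v) < 1/4$, we have $(t,1) \notin \mathrm{Bad}^*$. Sort the vertices in $\mathrm{Bad}^*$ according to layer, so that $\mathrm{Bad}^* = \{(t_1,2), \ldots, (t_w,2)\}$. We have

$$\Pr_{x \sim f^{-1}(1)}[\mathrm{Path}(x) \cap \mathrm{Bad}^* = \emptyset] = \prod_{i=1}^{w} \Pr_{x \sim f^{-1}(1)}\big[(t_i,2) \notin \mathrm{Path}(x) \mid (t_1,2), \ldots, (t_{i-1},2) \notin \mathrm{Path}(x)\big].$$

Note that if $(t_{i-1},2) \notin \mathrm{Path}(x)$ then $(t_{i-1},1) \in \mathrm{Path}(x)$. Hence conditioning on not visiting $(t_1,2), \ldots, (t_{i-1},2)$ is the same as conditioning on visiting $(t_1,1), \ldots, (t_{i-1},1)$. Further, conditioning on visiting $(t_1,1), \ldots, (t_{i-1},1)$ is the same as conditioning on visiting $(t_{i-1},1)$. Therefore,

$$\Pr_{x \sim f^{-1}(1)}\big[(t_i,2) \in \mathrm{Path}(x) \mid (t_1,1), \ldots, (t_{i-1},1) \in \mathrm{Path}(x)\big] = \Pr_{x \sim f^{-1}(1)}\big[(t_i,2) \in \mathrm{Path}(x) \mid (t_{i-1},1) \in \mathrm{Path}(x)\big]$$

$$\le \frac{\Pr_{x \sim f^{-1}(1)}[(t_i,2) \in \mathrm{Path}(x)]}{\Pr_{x \sim f^{-1}(1)}[(t_{i-1},1) \in \mathrm{Path}(x)]} \le \frac{4\,q((t_i,2))}{3},$$

because $q((t_{i-1},1)) = 1 - q((t_{i-1},2)) \ge 3/4$. Hence we have

$$\Pr_{x \sim f^{-1}(1)}[\mathrm{Path}(x) \cap \mathrm{Bad}^* = \emptyset] = \prod_{i=1}^{w} \Pr\big[(t_i,2) \notin \mathrm{Path}(x) \mid (t_{i-1},1) \in \mathrm{Path}(x)\big]$$

$$\ge \prod_{i=1}^{w} \Big(1 - \frac{4\,q((t_i,2))}{3}\Big) \ge e^{-2\sum_{i=1}^{w} q((t_i,2))} \ge (p/2)^4,$$

where we used the fact that for $z < 1/4$, $(1 - 4z/3) \ge e^{-2z}$, and that $\sum_{v \in \mathrm{Bad}^*} q(v) \le 2\log(2/p)$. □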

We are now ready to prove Theorem 6.7. 

Proof of Theorem 6.7. Observe that by the above claim, we can replace vertices in $\mathrm{Bad}^*$ by Rej vertices, and get a new program $B'$ such that $B' \le B$ and $\mathbb{E}[B'] \ge p \cdot (p/2)^4 \ge (p/2)^5$. Lastly, we handle the vertices in $\mathrm{Bad}'$, which currently have transitions to Rej. Assume that these vertices are $v_1, \ldots, v_j$ and that they read variables $x_{i_1}, \ldots, x_{i_j}$. There exists a fixing $a_{i_1}, \ldots, a_{i_j}$ of these variables such that the probability of acceptance of $B'$ over the remaining variables is at least $(p/2)^5$. Let $B'(a)$ denote the program obtained by hardwiring these values in $B'$. Now consider the program $B'' = B'(a) \wedge (x_{i_1} = a_{i_1}) \wedge \cdots \wedge (x_{i_j} = a_{i_j})$; then $B'' \le B'$ and

$$\mathbb{E}[B''] \ge (p/2)^5 \cdot 2^{-j} \ge (p/2)^{13},$$

since $|\mathrm{Bad}'| \le 8\log(2/p)$.

We only need to argue that $B'(a)$, and hence $B''$, is an intersection of width 2 branching programs. Note that $B'(a)$ is a width 2 program but with Rej states for every vertex in $\mathrm{Bad}^* = \{(t_1,2), \ldots, (t_w,2)\}$. But we can view $B'(a)$ as an intersection of branching programs $B'_i$ for $i \in \{1, \ldots, w-1\}$, where each $B'_i$ has its own start state and accept state. This completes the proof of the claim. □
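
The final bound in this proof uses only the size bound on the set of high-probability bad vertices. As a worked check, writing $j$ for the number of such vertices (at most $8\log(2/p)$) and assuming logarithms are taken to base 2 (an assumption about the convention in use here):

```latex
\[
2^{-j} \;\ge\; 2^{-8\log_2(2/p)} \;=\; \left(\tfrac{p}{2}\right)^{8},
\qquad\text{so}\qquad
\left(\tfrac{p}{2}\right)^{5}\cdot 2^{-j} \;\ge\; \left(\tfrac{p}{2}\right)^{13}.
\]
```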

6.3 Reducing intersections of BP(2) to CNF$^\oplus$

We now perform the final step in our sequence of reductions to prove Theorem 6.2. 

Theorem 6.12. Let $f : \{0,1\}^n \to \{0,1\}$ be an intersection of width 2 BPs on disjoint sets of inputs, i.e., $f = f_1 \wedge f_2 \wedge \cdots \wedge f_m$, where each $f_i \in \mathrm{BP}(2,n)$. Then, there exists a CNF$^\oplus$ $g : \{0,1\}^n \to \{0,1\}$
such that, g< f and E[g] > E[ff^^\ 

We use the following characterization of width 2 branching programs as decision lists due to Saks and Zuckerman [SZ95] and Bshouty, Tamon and Wilson [BTW98]. For a set $S \subseteq [n]$, let $\mathrm{And}(S)$ denote all functions of the form $\wedge_j y_j$ where $j \in S$ and $y_j \in \{x_j, \bar{x}_j\}$. We define $\mathrm{Or}(S)$ and $\mathrm{XOR}(S)$ similarly. Note that all these classes contain the constant functions.

Theorem 6.13 ([SZ95, BTW98]). Let $f \in \mathrm{BP}(2)$ be computed by a read-once, width 2 branching program that reads variables $x_S$ for $S \subseteq [n]$. Then $f$ is computable by a decision list $\mathcal{L}_f$ of the following form.

• $\mathcal{L}_f$ reads variables $x_V$ for some $V \subseteq S$ of size $k$.

• There are $k+1$ leaves denoted $L_1, \ldots, L_{k+1}$, where $L_j$ is labeled by a function $\ell_j \in \mathrm{XOR}(S \setminus V)$.¹

We order $V$ according to how variables are read by $\mathcal{L}_f$ and use $V^j$ to denote the indices of the first $j$ variables. The condition that $x$ reaches $L_j$ is given by a function $g_j \in \mathrm{And}(V^j)$. We say that $L_j$ accepts $x$ if $g_j(x) = 1$ and $\ell_j(x) = 1$.
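
To make the decision-list form of Theorem 6.13 concrete, the following is a minimal sketch of how such a list could be evaluated on inputs in $\{0,1\}^n$. The encoding is hypothetical and chosen only for illustration: `exits` lists, in reading order, the variables of $V$ together with the value that sends the input to the corresponding leaf, and each leaf is a parity over variables outside $V$ with a target bit.

```python
from itertools import product

# Hypothetical encoding (not taken from the paper):
#   exits  : k pairs (variable index, exit value). At node j the input leaves
#            the list and goes to leaf L_j when x[variable] == exit value.
#   leaves : k+1 pairs (parity_vars, target). A leaf outputs 1 iff the XOR of
#            x over parity_vars equals target, so ((), 0) is the constant 1
#            function and ((), 1) is the constant 0 function.

def eval_leaf(leaf, x):
    parity_vars, target = leaf
    parity = 0
    for v in parity_vars:
        parity ^= x[v]
    return int(parity == target)

def eval_decision_list(exits, leaves, x):
    """Walk the list until some node sends x to its leaf; else use the last leaf."""
    assert len(leaves) == len(exits) + 1
    for j, (var, exit_value) in enumerate(exits):
        if x[var] == exit_value:
            return eval_leaf(leaves[j], x)
    return eval_leaf(leaves[-1], x)

def acceptance_probability(exits, leaves, n):
    """Exact acceptance probability under the uniform distribution (brute force)."""
    accepted = sum(eval_decision_list(exits, leaves, x)
                   for x in product((0, 1), repeat=n))
    return accepted / 2 ** n
```

In this encoding, the condition that an input reaches leaf $L_j$ is a conjunction of $j$ literals over the first $j$ variables of $V$, which matches the role of $g_j$ in the theorem.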

We derive two consequences of Theorem 6.13. 

Lemma 6.14. Let $f$ be as in Theorem 6.13. If $\mathbb{E}[f] \ge 5/6$, then there exists $g \in \mathrm{Or}(V)$ such that
g<fandng]>nff- 

Proof. Let $\mathbb{E}[f] = 1 - \epsilon$ for $\epsilon \le 1/6$. Note that

$$\epsilon = \sum_{j=1}^{k+1} 2^{-j} \Pr[\ell_j(x) = 0].$$

Consider the smallest $j$ such that $\ell_j$ is not the constant 1 function. Since $\ell_j \in \mathrm{XOR}(S \setminus V)$ and $\ell_j \not\equiv 1$, $\ell_j$ rejects with probability at least $1/2$, hence $\epsilon \ge 2^{-j-1}$.

¹ A decision list is a decision tree where the left child of every node is a leaf labeled by one of the functions $\ell_j$. On an input $x$, the output is computed by traversing the tree until a leaf is reached and outputting the value computed by the function at the leaf.

The condition that $x$ reaches one of $L_1, \ldots, L_{j-1}$ is given by $g \in \mathrm{Or}(V^{j-1})$. Since $\ell_1 = \ell_2 = \cdots = \ell_{j-1} = 1$, we have that $g \le f$ and $\mathbb{E}[g] = 1 - 2^{-j+1} \ge 1 - 4\epsilon$. Since $\epsilon \le 1/6$, the inequality

(1 - 4e) > (1 - holds. □ 

Lemma 6.15. Let $f$ be as in Theorem 6.13. There exist $h_1 \in \mathrm{And}(V)$ and $h_2 \in \mathrm{XOR}(S \setminus V)$ such that if we define $h = h_1 \wedge h_2$ then $h \le f$ and $\mathbb{E}[h] \ge \mathbb{E}[f]/3$.

Proof. Let $L_j$ be the highest leaf in $\mathcal{L}_f$ which is not labeled 0. Set $h_1 = g_j$ and $h_2 = \ell_j$. It is easy to see that

$$\Pr[h(x) = 1] = \Pr[L_j \text{ accepts } x] \ge \tfrac{1}{3}\Pr[f(x) = 1].$$

□

We now prove Theorem 6.12.

Proof of Theorem 6.12. Let $f = f_1 \wedge f_2 \wedge \cdots \wedge f_m$, where $f_i \in \mathrm{BP}(2,n)$. Let $p = \mathbb{E}[f]$. Then, for $I = \{i : \mathbb{E}[f_i] < 5/6\}$, $|I| \le \log_{6/5}(1/p)$. For $i \notin I$, let $g_i$ be the function obtained from Lemma 6.14 and for $i \in I$, let $h_i$ be the function obtained from Lemma 6.15. Let $g = (\wedge_{i \notin I}\, g_i) \wedge (\wedge_{i \in I}\, h_i)$. Then, clearly $g \in \mathrm{CNF}^\oplus$, $g \le f$ and

m = X{n9^] -Wm] > ll^[f^f ■ n inh]m > / • p • ^ > i'- 

i^I i&I i^I i&I 

□ 

6.4 HSG for BP(3,n) 

We now combine the previous sections to prove Theorems 6.2, 6.3. 

Proof of Theorem 6.2. Follows immediately from combining Theorem 6.5, 6.7, 6.12. □ 

Proof of Theorem 6.3. Let $f \in \mathrm{BP}(3,n)$ with $\mathbb{E}[f] \ge \epsilon$. Let $g, k$ be as given by Theorem 6.2 applied to $f$ so that $\mathbb{E}[g] \ge \delta = (\epsilon/n)^c$. Let $G' : \{0,1\}^s \to \{0,1\}^n$ be a PRG for CNF$^\oplus$ with error at most $\delta/2$. By Theorem 8.2, there exists an explicit $G'$ with seed-length $s = O(\log(n/\epsilon) \cdot (\log\log(n/\epsilon))^3)$. Define $G : \{0,1\}^{\log n + s} \to \{0,1\}^n$ as follows:

• Sample $r \sim [n]$ and $y \sim \{0,1\}^s$.

• Output $r$ 0s followed by the first $n - r$ bits of $G'(y)$.

We claim that $G$ is a $(\epsilon, (\epsilon/n)^{c+1})$-HSG for $\mathrm{BP}(3,n)$.
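
The two-step definition of $G$ can be sketched in a few lines. This is only an illustration of the padding step: the argument `prg` stands for the generator $G'$, which is treated here as an opaque callable, and the function name `hsg_output` is introduced for this sketch.

```python
def hsg_output(prg, n, r, y):
    """Return r zeros followed by the first n - r bits of prg(y).

    prg is a placeholder for G' (any callable mapping a seed to at least
    n - r bits); r plays the role of the guess for the parameter k.
    """
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    bits = list(prg(y))
    if len(bits) < n - r:
        raise ValueError("prg output is too short")
    return [0] * r + bits[: n - r]
```

For example, with a toy `prg` that ignores its seed and returns `[1, 0, 1, 1, 0, 1]`, the call `hsg_output(prg, 6, 2, [])` pads two zeros in front of the first four bits. A real instantiation would sample `r` uniformly and `y` from the seed space of $G'$.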

Assume that we guess $r = k$ correctly, which happens with probability $1/n$. $G$ then simulates $g$ on the string $G'(y)$. Since $\mathbb{E}[g] \ge \delta$,

$$\Pr_{y \in \{0,1\}^s}[g(G'(y)) = 1] \ge \mathbb{E}[g] - \delta/2 \ge \delta/2.$$

Therefore,

$$\Pr_{r \sim [n],\ y \in \{0,1\}^s}[f(G(r,y)) = 1] \ge \delta/2n.$$

The theorem now follows. □

7 PRGs for Read-Once CNFs



We construct a PRG for read-once CNFs (RCNFs) with a seed-length of $O((\log n) \cdot (\log\log n)^3)$ and error $1/\mathrm{poly}(n)$. As mentioned in the introduction, previously, only generators with seed-length $O(\log^2 n)$ were known for error $1/\mathrm{poly}(n)$. Besides being of interest on its own, this construction will play an important role in our HSG for width 3 branching programs. Our main construction and its analysis are similar in spirit to what we saw for combinatorial rectangles and will be based on Theorem 3.1.

Theorem 7.1. For every $\epsilon > 0$, there exists an explicit PRG $G : \{0,1\}^r \to \{\pm 1\}^n$ that fools all RCNFs on $n$ variables with error at most $\epsilon$ and seed-length $r = O((\log(n/\epsilon)) \cdot (\log\log(n/\epsilon))^3)$.

The core of our construction will be a structural lemma that can be summarized as follows: the bias function of a random restriction of $f$, where each variable has a small constant probability of being set, has small $L_1$-norm sandwiching approximators.

Along with the structural lemma we shall also exploit the fact that for any RCNF, randomly 
restricting a constant fraction of the inputs simplifies the formula significantly: with high probability 
a size m RCNF upon a random restriction has size at most poly(logn) • m'^ , where 7 < 1 is a fixed 
constant. Theorem 7.1 is then proved using a recursive construction, where we use the above 
arguments for O(loglogn) steps. 

7.1 Sandwiching Approximators for Bias Functions 

For a function $f : \{\pm 1\}^n \to [0,1]$, a subset $I \subseteq [n]$ and $x \in \{\pm 1\}^I$, define $f_I : \{\pm 1\}^I \to [0,1]$ by

$$f_I(x) = \mathbb{E}_{y \in_u \{\pm 1\}^{[n] \setminus I}}[f(x \circ y)],$$

where $x \circ y$ denotes the appropriate concatenation: $(x \circ y)_i = x_i$ if $i \in I$ and $(x \circ y)_i = y_i$ if $i \notin I$. We call $f_I$ the "bias function" of the restriction $(x, I)$.

We will show that for a RCNF $f$, and $I$ chosen in an almost $k$-wise independent manner, the bias function $f_I$ has small $L_1$-norm sandwiching approximators with very high probability (over the choice of $I$).
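
For small examples the bias function can be computed by brute force, which may help in checking the definition. The sketch below assumes a hypothetical encoding of an RCNF as a list of clauses, each clause a list of `(variable, sign)` pairs, with a literal counted as satisfied when the variable equals `sign`; the paper does not fix which of $\pm 1$ represents "true", so that convention is an assumption made here.

```python
from itertools import product

def eval_rcnf(clauses, z):
    """Evaluate a CNF on a full assignment z (dict: variable -> +1 or -1)."""
    return int(all(any(z[v] == s for v, s in clause) for clause in clauses))

def bias_function(clauses, n, x):
    """Compute f_I(x) exactly, where I is the key set of the dict x.

    The variables outside I are averaged over uniformly at random,
    by enumerating all of their assignments (exponential in n - |I|).
    """
    free = [i for i in range(n) if i not in x]
    total = 0
    for ys in product((-1, 1), repeat=len(free)):
        z = dict(x)
        z.update(zip(free, ys))
        total += eval_rcnf(clauses, z)
    return total / 2 ** len(free)
```

For instance, with the two-clause formula $(x_0 \vee x_1) \wedge (x_2 \vee x_3)$ and the restriction $x_0 = +1$, $x_2 = -1$, the first clause is already satisfied and the second is satisfied exactly when $x_3 = +1$, so the bias function evaluates to $1/2$ at that point.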

Lemma 7.2 (Main). There exist constants $\alpha$ and $c > 0$ such that the following holds for every $\epsilon > 0$ and $\delta < (\epsilon/n)^c$. Let $f : \{\pm 1\}^n \to \{0,1\}$ be a RCNF and $I \sim \mathcal{D}(\alpha, \delta)$. Then, with probability at least $1 - \epsilon$, $f_I$ has $\epsilon$-sandwiching approximators with $L_1$-norm at most

$$L(n, \epsilon) = (n/\epsilon)^{c(\log\log(n/\epsilon))^2}.$$

Proof. Let $f = C_1 \wedge C_2 \wedge \cdots \wedge C_m$. By abuse of notation, we will let $C_i$ denote the set of variables appearing in $C_i$ as well. In our analysis we shall group the clauses based on their widths. Let $\beta = 1 + 1/6$.

We first handle the case where $f$ has bias at least $\epsilon$, i.e., $\Pr[f(x) = 1] \ge \epsilon$. Let $W_\ell = c_1 \log\log(n/\epsilon)$ and $W_u = \log_2 m$ for $c_1$ a constant to be chosen later. Let $f_\ell$ be the RCNF containing all clauses of width less than $W_\ell$ and $f_u$ the RCNF containing all clauses of width at least $W_u$. Let $T = \log_\beta(W_u/2W_\ell)$. For $w \in W_B = \{\lfloor W_\ell \beta^r \rfloor : 0 \le r \le T\}$, let $f_w$ be the RCNF containing all clauses with width in $[w, \beta w)$. Then,

$$f = f_\ell \wedge \Big(\bigwedge_{w \in W_B} f_w\Big) \wedge f_u. \qquad (7.1)$$

We will show that each of the functions $f_\ell, f_w, f_u$ has good sandwiching approximators. We then use Theorem 4.1 to conclude that $f$ has good sandwiching approximators. The claim for $f_\ell$ follows immediately from Theorem 2.7. The main challenge will be in analyzing $f_w$ (the analysis for $f_u$ is similar). To show that $f_w$ has good sandwiching approximators, we shall appeal to Theorem 3.2.

Observe that as $f$ has bias at least $\epsilon$, the number of clauses of width at most $w$ in $f$ is at most $2^w \log(1/\epsilon)$. We will repeatedly use this fact. Let $\epsilon_1 = \epsilon/\mathrm{poly}(n)$, to be chosen later.
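
The counting fact used repeatedly in this proof has a short justification. The following is a sketch, assuming only that distinct clauses of a read-once CNF depend on disjoint variables and are therefore independent under the uniform distribution; the symbol $k$ for the number of clauses of width at most $w$ and the use of the natural logarithm are choices made for this sketch.

```latex
\[
\epsilon \;\le\; \Pr_x[f(x)=1]
\;\le\; \prod_{j:\,|C_j|\le w} \Pr_x[C_j(x)=1]
\;\le\; \left(1-2^{-w}\right)^{k}
\;\le\; e^{-k/2^{w}},
\qquad\text{hence}\qquad
k \;\le\; 2^{w}\ln(1/\epsilon).
\]
```

Here the third inequality uses that a clause of width at most $w$ is falsified with probability at least $2^{-w}$.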



Sandwiching $f_\ell$. As each clause in $f_\ell$ has width at most $W_\ell$, the number of clauses in $f_\ell$ is at most $m_\ell \le 2^{W_\ell}\log(1/\epsilon) \le (\log(n/\epsilon))^{c_1+1}$. Thus, by Theorem 2.7, $f_\ell$ has $\epsilon_1$-sandwiching approximators of $L_1$-norm at most $m_\ell^{O(\log(1/\epsilon_1))} = (\log(n/\epsilon))^{O(\log(1/\epsilon_1))}$. As $L_1$-norm does not increase under averaging over a subset of the variables, it follows that $(f_\ell)_I$ has $\epsilon_1$-sandwiching polynomials with the same $L_1$-norm bound:

$(f_\ell)_I$ has $\epsilon_1$-sandwiching approximators with $L_1$-norm at most $(\log(n/\epsilon))^{O(\log(1/\epsilon_1))}$. (7.2)

Sandwiching $f_w$. Fix a $w \in W_B$. Note that $f_w$ has $m_w \le 2^{\beta w}\log(1/\epsilon)$ clauses. Without loss of generality, suppose that $f_w = C_1 \wedge C_2 \wedge \cdots \wedge C_{m_w}$. Let $I \sim \mathcal{D}(\alpha, \delta)$.

Let $J \subseteq [m_w]$ be the set of all "good" clauses, $J = \{j : |C_j \cap I| \le w/3\}$. Decompose $f_w = f'_w \wedge f''_w$, where $f'_w = \wedge_{j \in J}\, C_j$. We first show that $(f'_w)_I$ has good sandwiching approximators. We then show that $f''_w$ has a small number, $\mathrm{poly}(\log(n/\epsilon))$, of clauses with high probability over $I$. The intuition for the first step is that if each $|C_j \cap I|$ is small, then the randomness in the remaining variables damps the variance of the bias function $f_I$ enough to guarantee existence of good sandwiching approximators via Theorem 3.2. For the second step, intuitively, as $I$ picks each element with probability at most $\alpha$, we expect $|C_j \cap I|$ to be about $\alpha|C_j| \le \alpha(\beta w) \ll w/3$. Thus, the probability that $|C_j \cap I|$ is more than $w/3$ should be small, so that the total number of bad clauses is small with high probability.

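For brevity, suppose that $f'_w = C_1 \wedge \cdots \wedge C_{m'}$ and let $w_j = |C_j| \in [w, \beta w)$. For $x \in \{\pm 1\}^I$ and $j \in [m']$, define $g'_j : \{\pm 1\}^I \to [-1, 1]$ by

$$g'_j(x) = \begin{cases} -1/2^{w_j} & \text{if } x \text{ satisfies } C_j|_I, \\ 1/2^{w_j - |C_j \cap I|} - 1/2^{w_j} & \text{otherwise.} \end{cases}$$

Then, for $p_j = 1 - 1/2^{w_j}$,

$$(f'_w)_I(x) = \prod_{j=1}^{m'} \big((1 - 1/2^{w_j}) - g'_j(x)\big) = \prod_{j=1}^{m'} \big(p_j - g'_j(x)\big).$$

Let $g_j(x) = g'_j(x)/p_j$. Then,

$$(f'_w)_I(x) = \Big(\prod_{j=1}^{m'} p_j\Big) \cdot \prod_{j=1}^{m'} \big(1 - g_j(x)\big).$$

By expanding the above expression we can write $(f'_w)_I(x) = \sum_k c_k S_k(g_1, \ldots, g_{m'})$ where the coefficients $c_k$ are at most 1 in absolute value. We will show that $g_1, \ldots, g_{m'}$ satisfy the conditions of Theorem 3.2.

Clearly, $g_1, \ldots, g_{m'}$ are on disjoint subsets of $x$. Note that $g'_j(x) \in [-1/2^{2w/3}, 1/2^{2w/3}]$, since $|C_j \cap I| \le w/3$ for every clause of $f'_w$. Hence, as $p_j \ge 1/2$, $g_j(x) \in [-a, a]$ for $a = 2/2^{2w/3}$. Now, as $w \ge c_1 \log\log(1/\epsilon)$, $m' \le 2^{\beta w}\log(1/\epsilon) \le 2^{(\beta + 1/c_1)w}$. Thus, for $c_1 \ge 12$,

$$a = \frac{2}{2^{2w/3}} \le \frac{1}{(m')^{1/2 + 1/12}}.$$

Finally, note that each $g_j$ has $L_1$-norm at most 2. This is because any clause, and hence $g'_j$, has $L_1$-norm at most 1. Therefore, the functions $g_j$ satisfy the conditions of Theorem 3.2. Thus,

$(f'_w)_I$ has $\epsilon_1$-sandwiching approximators with $L_1$-norm at most $\mathrm{poly}(1/\epsilon_1)$. (7.3)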

We are almost done, but for $(f''_w)_I$. We will show that with high probability over $I$, $f''_w$ has $O(\log(n/\epsilon))$ clauses. To do so we will follow a standard argument for showing large deviation bounds using bounded independence.

For $i \in [w]$ and $j \in [m_w]$, let $X_{ij}$ be the indicator variable that is 1 if the variable corresponding to the $i$'th literal in the $j$'th clause of $f_w$ is included in $I$ and 0 otherwise. Let

$$X = S_k\big(S_{w/3}(X_{11}, X_{21}, \ldots, X_{w1}), \ldots, S_{w/3}(X_{1m_w}, X_{2m_w}, \ldots, X_{wm_w})\big).$$

Then, for any $k$,

$$\Pr[\mathrm{size}(f''_w) \ge k] \le \mathbb{E}[X].$$

To see this observe that whenever $\mathrm{size}(f''_w) \ge k$, $X$ is at least 1. Let us first calculate this expectation when the variables $X_{ij}$ are truly independent. In this case, we use that $m_w \le 2^{\beta w}\log(1/\epsilon)$ and that each clause has at most $\beta w$ variables.

Therefore, for $\alpha = 1/32$ and $k = c_2 \max(\log(n/\epsilon)/w, 1)$ for $c_2$ sufficiently large, $\mathbb{E}[X] \le \epsilon/2n$. Now, as the actual variables $X_{ij}$ are $\delta$-almost independent, and the polynomial defining $X$ has at most $\binom{m_w}{k} \cdot \binom{\beta w}{w/3}^k$ terms, the expectation for $I \sim \mathcal{D}$ can be bounded by

$$\mathbb{E}[X] \le \frac{\epsilon}{2n} + \delta \cdot \binom{m_w}{k} \cdot \binom{\beta w}{w/3}^k = \frac{\epsilon}{2n} + \delta \cdot 2^{O(wk)} \le \frac{\epsilon}{n},$$

for $\delta \le (\epsilon/n)^c$ for $c$ a sufficiently large constant. Combining the above equations and applying Theorem 2.7, we get that with probability at least $1 - \epsilon/n$, $(f''_w)_I$ has $\epsilon_1$-sandwiching polynomials with $L_1$-norm at most

$$k^{O(\log(1/\epsilon_1))} \le (\log(n/\epsilon))^{O(\log(1/\epsilon_1))}.$$
Therefore, from Equation (7.3) and Theorem 4.1, with probability at least $1 - \epsilon/n$,

$(f_w)_I$ has $O(\epsilon_1)$-sandwiching approximators with $L_1$-norm at most $(\log(n/\epsilon))^{O(\log(1/\epsilon_1))}$. (7.4)

Sandwiching $f_u$. A careful examination of the argument for $f_w$ reveals that we used two main properties: there are at most $2^{\beta w}\log(1/\epsilon)$ clauses in $f_w$ and every clause has length at least $w$. Both of these are trivially true for $f_u$ with $w = W_u$. Thus, the same argument applies. In particular, with probability at least $1 - \epsilon/n$,

$(f_u)_I$ has $O(\epsilon_1)$-sandwiching approximators with $L_1$-norm at most $(\log(n/\epsilon))^{O(\log(1/\epsilon_1))}$. (7.5)

Now, observe that

$$f_I(x) = (f_\ell)_I(x) \cdot (f_u)_I(x) \cdot \prod_{w \in W_B} (f_w)_I(x).$$

Therefore, we can apply Theorem 4.1. In particular, by Equations (7.2), (7.4), (7.5), and a union bound, for $b = |W_B| + 2 = O(\log\log n)$ we have: with probability at least $1 - \epsilon$, $f_I$ has $(16^b \epsilon_1)$-sandwiching polynomials with $L_1$-norm at most

$$4^b \cdot \big((\log(n/\epsilon))^{O(b \log(1/\epsilon_1))}\big) = (1/\epsilon_1)^{O((\log\log(n/\epsilon))^2)}.$$

The lemma now follows by setting $\epsilon_1 = \epsilon/n^{O(1)}$.

Handling Small Bias Case. We now remove the assumption that $\mathbb{E}[f] \ge \epsilon$. Suppose $\mathbb{E}[f] < \epsilon$. Consider the formula $f'$ obtained from $f$ by removing clauses in $f$ until the first time $\mathbb{E}[f']$ exceeds $\epsilon$. Then, $f \le f'$ and $\epsilon \le \mathbb{E}[f'] \le 2\epsilon$ (as each clause has probability at most $1/2$ of being false).

We can use the upper approximator for $f'$ as an upper approximator for $f$ and constant zero as a lower approximator. This completes the proof of the lemma. □

7.2 Restrictions Simplify RCNFs 

We next argue that for restrictions $(x, I)$ chosen from almost-independent distributions as in the previous section, RCNFs simplify significantly and in particular have few surviving clauses with very high probability.

Let $I \sim \mathcal{D}(\alpha, \delta)$ be as in Lemma 7.2 and $x \sim \mathcal{D}$ be chosen from a $\delta_1$-biased distribution with $\delta_1 = 1/\mathrm{poly}(n)$. We will show that fixing the variables in $I$ according to $x$ will make the number of clauses drop polynomially. Let $\alpha, \beta$ be the constants from Lemma 7.2.

Lemma 7.3. There exist constants $c_2, \gamma > 0$ such that the following holds for $\delta, \delta_1 \le (\epsilon/n)^{c_2}$. Let $I \sim \mathcal{D}(\alpha, \delta)$ and $x \sim \mathcal{D}$ where $\mathcal{D}$ is a $\delta_1$-biased distribution on $\{\pm 1\}^n$. Let $f : \{\pm 1\}^n \to \{0,1\}$ be a RCNF with $m$ clauses and $\mathbb{E}[f] \ge \epsilon$. Let $g : \{\pm 1\}^{[n] \setminus I} \to \{0,1\}$ be the RCNF obtained from $f$ by fixing the variables in $I$ to $x$. Then, with probability at least $1 - \epsilon$ over the choice of $(x, I)$, $g$ is a RCNF with at most $(\log(n/\epsilon))^{c_2} \cdot m^{1-\gamma}$ clauses.
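
The effect of a restriction on an RCNF can be illustrated with a small sketch. The clause encoding (lists of `(variable, sign)` pairs, with a literal satisfied when the variable equals `sign`) and the function name `restrict_rcnf` are hypothetical and introduced only for this example; the partial assignment `x` is a dict whose key set plays the role of $I$.

```python
def restrict_rcnf(clauses, x):
    """Fix the variables in x and return (surviving_clauses, is_constant_zero).

    A clause containing a satisfied literal is fixed to true and dropped.
    Every other clause survives with its fixed (falsified) literals removed.
    If a surviving clause becomes empty, the restricted formula is constant 0.
    """
    surviving = []
    constant_zero = False
    for clause in clauses:
        if any(v in x and x[v] == s for v, s in clause):
            continue  # clause is fixed to true by the restriction
        rest = [(v, s) for v, s in clause if v not in x]
        if not rest:
            constant_zero = True
        surviving.append(rest)
    return surviving, constant_zero
```

Under a truly random restriction in which each variable is fixed with probability $\alpha$ to a uniform value, each literal independently satisfies its clause with probability $\alpha/2$, which is the source of the survival probability $(1 - \alpha/2)^{w_j}$ used in the proof of the lemma.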

Proof. As in the proof of Lemma 7.2, we shall do a case analysis based on the width of the clauses. Let $f_\ell, f_w, f_u$ and $W_B$ be as in Equation (7.1). Note that the number of clauses in $f_\ell$ is at most $2^{W_\ell}\log(1/\epsilon) = \mathrm{poly}(\log(n/\epsilon))$. We will now reason about each of the $f_w$'s for $w \in W_B$. The argument for $f_u$ is similar and is omitted.

Let $f_w$ have $m_w$ clauses, where $m_w \ge 8\log(1/\epsilon)$, otherwise there is nothing to prove. Without loss of generality, suppose that $f_w = C_1 \wedge C_2 \wedge \cdots \wedge C_{m_w}$ and $w_j = |C_j|$. Let $Y_j$ be the indicator variable that is 1 if $C_j$ survives in $g$ (i.e., is not fixed to be true) and 0 otherwise. We first do the calculations assuming that the variables in $x$ and $I$ are truly independent and later transfer these bounds to the almost independent case.

Observe that $\Pr[Y_j = 1] = (1 - \alpha/2)^{w_j} \le (1 - \alpha/2)^w$. Let $M, k \le M$ be parameters to be chosen later. Then, as in the proof of Lemma 7.2,

$$\Pr\Big[\sum_j Y_j \ge M\Big] \cdot \binom{M}{k} \le \mathbb{E}[S_k(Y_1, \ldots, Y_{m_w})] \le \binom{m_w}{k} \Big(1 - \frac{\alpha}{2}\Big)^{wk}.$$

Here, the first inequality follows from observing that if $\sum_j Y_j \ge M$, then $S_k(Y_1, \ldots, Y_{m_w})$ is at least $\binom{M}{k}$. Therefore,

$$\Pr\Big[\sum_j Y_j \ge M\Big] \le \Big(\frac{e\, m_w}{M}\Big)^k \Big(1 - \frac{\alpha}{2}\Big)^{wk}.$$

Now, setting $M = (e\log(1/\epsilon)) \cdot m_w^{1-\gamma}$ for a sufficiently small constant $\gamma$ and using the fact that $m_w \le 2^{\beta w}\log(1/\epsilon)$, it follows that

$$\Pr\Big[\sum_j Y_j \ge M\Big] \le \Big(\frac{m_w^{\gamma}}{\log(1/\epsilon)} \cdot \Big(1 - \frac{\alpha}{2}\Big)^{w}\Big)^k \le 2^{-\Omega(wk)}$$

for $\gamma$ a sufficiently small constant. Thus, for $k = c_3 \max(\log(n/\epsilon)/w, 1)$ and $c_3$ a sufficiently large constant, $\Pr[\sum_j Y_j \ge M] \le \epsilon/2n$. Now, as in the proof of Lemma 7.2, transferring the above calculations to the case of almost independent distributions only incurs an additional error of

$$\mathrm{err} = (\delta + \delta_1) \cdot \mathrm{poly}(n, 1/\epsilon).$$

Therefore, for $\delta, \delta_1 \le (\epsilon/n)^{c'}$ for a sufficiently large constant $c'$, we get $\Pr[\sum_j Y_j \ge M] \le \epsilon/n$.

Hence, by a union bound over $w \in W_B$, with probability at least $1 - \epsilon$, the number of surviving clauses in $g$ is at most

$$\mathrm{size}(f_\ell) + (e\log(1/\epsilon)) \cdot \sum_{w \in W_B} m_w^{1-\gamma} \le \mathrm{poly}(\log n) + (e\log(1/\epsilon)) \cdot |W_B| \cdot \Big(\frac{\sum_{w \in W_B} m_w}{|W_B|}\Big)^{1-\gamma}$$

$$\le \mathrm{poly}(\log n) + (e\log(1/\epsilon)) \cdot |W_B|^{\gamma} \cdot m^{1-\gamma},$$

where the first inequality follows from the power-mean inequality. The claim now follows. □

In our recursive analysis we will also have to handle RCNFs that need not have high acceptance 
probabilities. The following corollary will help us do this. 

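Corollary 7.4. Let constants $c_2, \gamma$ and $\delta, \delta_1$, $I \sim \mathcal{D}(\alpha, \delta)$, $x \sim \mathcal{D}$ be as in Lemma 7.3. Let $f : \{\pm 1\}^n \to \{0,1\}$ be a RCNF, and let $g : \{\pm 1\}^{[n] \setminus I} \to \{0,1\}$ be the RCNF obtained from $f$ by fixing the variables in $I$ to $x$. Then, with probability at least $1 - \epsilon$ over the choice of $(x, I)$, there exist two RCNFs $g_\ell, g_u$ of size at most $(\log(n/\epsilon))^{c_2} \cdot m^{1-\gamma}$ such that $g_\ell \le g \le g_u$ and $\mathbb{E}[g_u] - \mathbb{E}[g_\ell] \le \epsilon$.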

Proof. If $\mathbb{E}[f] \ge \epsilon/2$, the claim follows from Lemma 7.3. Suppose $\mathbb{E}[f] < \epsilon/2$. Let $f'$ be the formula obtained from $f$ by throwing away clauses until $\mathbb{E}[f']$ exceeds $\epsilon/2$. Then, $\epsilon/2 \le \mathbb{E}[f'] \le \epsilon$. Let $g'$ be the RCNF obtained from $f'$ by restricting the variables in $I$ to $x$. The claim now follows by applying Lemma 7.3 to $f'$ and setting $g_\ell = 0$ and $g_u = g'$. □

7.3 A Recursive PRG Construction for RCNFs

We now use Lemmas 7.2 and 7.3 recursively to prove Theorem 7.1. The main intuition is as follows.

Let $\epsilon = 1/\mathrm{poly}(n)$. Lemma 7.2 ensures that with high probability over the choice of $I$, $f_I$ is fooled by small-bias spaces with bias $n^{-O((\log\log n)^2)}$, which can be sampled from using $O((\log n)(\log\log n)^2)$ random bits. Note that $I$ can be sampled using $O(\log n)$ random bits.

Consider any fixed $I \subseteq [n]$ and $x \in \{\pm 1\}^I$. We wish to apply the same argument to $f_{(x,I)} : \{\pm 1\}^{[n] \setminus I} \to \{0,1\}$ to pick another set $I_1 \subseteq [n]$ and $x_1 \in \{\pm 1\}^{I_1}$ and so on. The saving factor will be that most of the clauses in $f$ will be determined by the assignment to $x$. In particular, by Lemma 7.3, with probability $1 - 1/\mathrm{poly}(n)$, $f_{(x,I)}$ has at most $O(n^{1-\gamma})$ clauses. By repeating this argument for $t = O(\log\log n)$ steps we will get a RCNF with at most $\mathrm{poly}(\log n)$ clauses, which can be fooled directly. The total number of random bits used in this process will be $O((\log n)(\log\log n)^3)$.

Fix $\epsilon > 0$ and let constants $\alpha, c$ be as in Lemma 7.2. Let $\mathcal{D}(\alpha, \delta)$ be a $\delta$-almost independent distribution on $2^{[n]}$ with bias $\alpha$. Finally, let $\mathcal{D}(\delta_1), \mathcal{D}(\delta_2)$ denote $\delta_1$-biased and $\delta_2$-biased distributions on $\{\pm 1\}^n$ respectively, for $\delta_1, \delta_2$ to be chosen later. Let $T = C\log\log n$ for $C$ to be chosen later. Consider the following randomized algorithm for generating a string $z \in \{\pm 1\}^n$.

• For $t = 1, \ldots, T$, generate independent samples $z^1, \ldots, z^T \sim \mathcal{D}(\delta_1)$ and $J_1, \ldots, J_T \sim \mathcal{D}(\alpha, \delta)$.

• Let $I_1 = J_1$ and $I_t = J_t \setminus (\cup_{r=1}^{t-1} I_r)$ for $2 \le t \le T$. This is equivalent to sampling $I_t$ from a $\delta$-almost independent distribution with bias $\alpha$ from the set of subsets of as yet "uncovered" elements $[n] \setminus \cup_{r=1}^{t-1} I_r$.

• Let $x^t = (z^t)_{I_t}$. This is equivalent to sampling $x^t$ using a $\delta_1$-biased distribution over $\{\pm 1\}^{I_t}$.

• Let $I = \cup_{t=1}^{T} I_t$ and $x = x^1 \circ x^2 \circ \cdots \circ x^T \in \{\pm 1\}^I$ be the appropriate concatenation: for $i \in I$, $(x)_i = (x^t)_i$ if $i \in I_t$.

• Let $y \sim \mathcal{D}(\delta_2)$. The final generator output is defined by

$G(z^1, \ldots, z^T, J_1, \ldots, J_T, y) = z$, where $z_i = x_i$ if $i \in I$ and $z_i = y_i$ otherwise. (7.6)
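
The control flow of the generator in Equation (7.6) can be sketched as follows. This sketch deliberately replaces the pseudorandom ingredients by truly random stand-ins: `sample_subset` includes each index independently with probability `alpha` in place of the almost independent distribution, and `sample_signs` draws uniform signs in place of the small-bias distributions. Those helpers, the default constants and the function names are assumptions made for illustration, so the sketch shows the structure of the construction and not its seed-length.

```python
import math
import random

def sample_subset(n, alpha, rng):
    """Stand-in for D(alpha, delta): independent inclusion with probability alpha."""
    return {i for i in range(n) if rng.random() < alpha}

def sample_signs(n, rng):
    """Stand-in for a small-bias distribution on {-1,+1}^n: uniform signs."""
    return [rng.choice((-1, 1)) for _ in range(n)]

def recursive_restriction_generator(n, alpha=1 / 32, C=2.0, rng=None):
    """Return (z, I): the output string and the set of coordinates fixed by x."""
    rng = rng or random.Random()
    # T = C log log n rounds (rounded up, and at least one round).
    T = max(1, math.ceil(C * math.log2(max(2.0, math.log2(max(n, 2))))))
    x = {}  # partial assignment on I = I_1 ∪ ... ∪ I_T
    for _ in range(T):
        z_t = sample_signs(n, rng)
        J_t = sample_subset(n, alpha, rng)
        I_t = {i for i in J_t if i not in x}  # keep only uncovered coordinates
        for i in I_t:
            x[i] = z_t[i]                     # x^t = (z^t) restricted to I_t
    y = sample_signs(n, rng)
    z = [x.get(i, y[i]) for i in range(n)]    # z_i = x_i on I, y_i elsewhere
    return z, set(x)
```

A call such as `recursive_restriction_generator(64, rng=random.Random(0))` returns a string in $\{\pm 1\}^{64}$ together with the set of coordinates that were assigned during the restriction rounds.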



To analyze our generator we first show that the restriction $(x, I)$ preserves the bias of RCNFs. Let $L(n, \epsilon)$ be the bound from Lemma 7.2.

Lemma 7.5. For $x, I$ defined as above, with probability at least $1 - \epsilon T$ over the choice of $I$, for every RCNF $f : \{\pm 1\}^n \to \{0,1\}$,

$$\Big|\mathbb{E}_x[f_I(x)] - \mathbb{E}_{y \in_u \{\pm 1\}^I}[f_I(y)]\Big| \le \delta_1 \cdot L(n, \epsilon) \cdot T + 2\epsilon T.$$



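Proof. We will prove the claim by a hybrid argument. For $j \le T$, let $y^j \sim \{\pm 1\}^{I_j}$ and let $\mathcal{D}^j$ denote the distribution of $x^1 \circ x^2 \circ \cdots \circ x^j \circ y^{j+1} \circ \cdots \circ y^T$ (the concatenation is done as in the definition of $x$). Note that $\mathcal{D}^{j-1}$ and $\mathcal{D}^j$ differ only in the $j$'th concatenation element, $x^j, y^j$. Further, $\mathcal{D}^0$ is uniformly distributed on $\{\pm 1\}^I$ and $\mathcal{D}^T$ is the distribution of $x$. We will show that with probability at least $1 - \epsilon$, over the choice of $I$,

$$\Big|\mathbb{E}_{a \sim \mathcal{D}^j}[f_I(a)] - \mathbb{E}_{a \sim \mathcal{D}^{j-1}}[f_I(a)]\Big| \le \delta_1 \cdot L(n, \epsilon).$$

We couple the distributions $\mathcal{D}^{j-1}$ and $\mathcal{D}^j$ by drawing $x^i$ for $i < j$, and let $I^j = \cup_{r \le j} I_r$. Now, as $y^{j+1}, \ldots, y^T$ are chosen uniformly at random,

$$\mathbb{E}_{a \sim \mathcal{D}^j}[f_I(a)] = \mathbb{E}[f_{I^j}(x^1 \circ \cdots \circ x^{j-1} \circ x^j)],$$

$$\mathbb{E}_{a \sim \mathcal{D}^{j-1}}[f_I(a)] = \mathbb{E}[f_{I^j}(x^1 \circ \cdots \circ x^{j-1} \circ y^j)].$$

Consider any fixing of the variables $x^1, \ldots, x^{j-1}$ and $I_1, \ldots, I_{j-1}$ and let $g : \{\pm 1\}^{[n] \setminus \cup_{r<j} I_r} \to \{0,1\}$ be the RCNF obtained from $f$ under this fixing. Then, by Lemma 7.2, $g_{I_j}$ is fooled by small-bias spaces: with probability $1 - \epsilon$ over the choice of $I_j$,

$$\Big|\mathbb{E}[g_{I_j}(x^j)] - \mathbb{E}[g_{I_j}(y^j)]\Big| \le \delta_1 \cdot L(n, \epsilon).$$

Combining the above three equations, we have with probability at least $1 - \epsilon$ over $I_j$,

$$\Big|\mathbb{E}_{a \sim \mathcal{D}^j}[f_I(a)] - \mathbb{E}_{a \sim \mathcal{D}^{j-1}}[f_I(a)]\Big| = \Big|\mathbb{E}[f_{I^j}(x^1 \circ \cdots \circ x^{j-1} \circ x^j)] - \mathbb{E}[f_{I^j}(x^1 \circ \cdots \circ x^{j-1} \circ y^j)]\Big|$$

$$\le \mathbb{E}_{x^1, \ldots, x^{j-1}} \Big|\mathbb{E}_{x^j}[g_{I_j}(x^j)] - \mathbb{E}_{y^j}[g_{I_j}(y^j)]\Big| \le \delta_1 \cdot L(n, \epsilon).$$

The claim now follows by taking a union bound for $j = 1, \ldots, T$. □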



We are now ready to prove our main PRG construction. The idea is to combine Lemmas 7.3 and 7.5. For $(x, I)$ chosen as in Lemma 7.5 we do not change the bias of the restricted function; on the other hand, by iteratively applying Lemma 7.3 we can show that the resulting restricted RCNF has $(\log n)^{O(\log\log n)}$ clauses and hence is fooled by $n^{-O((\log\log n)^2)}$-biased distributions.

Proof of Theorem 7.1. Let $I, x, y, z$ be as defined in Equation (7.6). Fix a RCNF $f : \{\pm 1\}^n \to \{0,1\}$. Let $g : \{\pm 1\}^{[n] \setminus I} \to \{0,1\}$ be the RCNF obtained from $f$ by fixing the variables in $I$ to $x$. Let $I' = [n] \setminus I$. Note that

$$f_I(x) = \mathbb{E}_{y' \in_u \{\pm 1\}^{I'}}[g(y')]. \qquad (7.7)$$

We next argue that $g$ is fooled by small-bias spaces with high probability over the choice of $x, I$. Observe that $g$ can be viewed as obtained from $f$ by iteratively restricting $f$ according to $(x^1, I_1), (x^2, I_2), \ldots, (x^T, I_T)$, and all of these are independent of one another. Therefore, by Corollary 7.4 and a union bound, with probability at least $1 - \epsilon \cdot T$, $g$ has $O(\epsilon T)$-sandwiching RCNFs $g_\ell, g_u$ of size at most

$$M = (\log(n/\epsilon))^{c_2 T} \cdot m^{(1-\gamma)^T} = (\log(n/\epsilon))^{O(\log\log n)},$$

for $T = C\log\log n$ and $C$ a large constant. Hence, by Theorem 2.7, $g_\ell, g_u$ are $\epsilon$-fooled by $\delta_2$-biased distributions for $\delta_2 = M^{-O(\log(1/\epsilon))}$. As $g_\ell, g_u$ sandwich $g$, it follows that $g$ is $O(\epsilon T)$-fooled by $\delta_2$-biased distributions. As the above is true with probability at least $1 - \epsilon T$ over the choice of $(x, I)$, by taking expectation over $(x, I)$ we get ($y$ is $\delta_2$-biased)

$$\mathbb{E}_{x,I}\Big|\mathbb{E}_y[g(y)] - \mathbb{E}_{y' \sim \{\pm 1\}^{I'}}[g(y')]\Big| = O(\epsilon T). \qquad (7.8)$$

Combining Equations (7.7) and (7.8), we get

$$\Pr[f(z) = 1] = \mathbb{E}_{x,I}\,\mathbb{E}_y[g(y)] = \mathbb{E}_{x,I}\,\mathbb{E}_{y' \sim \{\pm 1\}^{I'}}[g(y')] \pm O(\epsilon T) = \mathbb{E}_{x,I}[f_I(x)] \pm O(\epsilon T).$$

Finally, note that for any $I \subseteq [n]$,

$$\Pr_{z' \sim \{\pm 1\}^n}[f(z') = 1] = \mathbb{E}_{x' \sim \{\pm 1\}^I}[f_I(x')].$$

Combining the above two equations with Lemma 7.5, we get

$$\Big|\Pr[f(z) = 1] - \Pr_{z' \sim \{\pm 1\}^n}[f(z') = 1]\Big| \le \Big|\mathbb{E}_{x,I}[f_I(x)] - \mathbb{E}_{I,\ x' \sim \{\pm 1\}^I}[f_I(x')]\Big| + O(\epsilon T) \le \delta_1 \cdot L(n, \epsilon) + O(\epsilon T).$$

Therefore, by setting $\delta_1 = \epsilon/L$ the above error is at most $O(\epsilon T)$. The number of bits used by the generator is

$$T \cdot (\text{bits needed for } x^t, I_t) + (\text{bits needed for } y) = T \cdot O(\log n + \log(1/\delta_1)) + O(\log n + \log(1/\delta_2))$$

$$= O((\log(n/\epsilon)) \cdot (\log\log(n/\epsilon))^3).$$

The theorem now follows by rescaling $\epsilon = \epsilon'/c'(\log\log n)$ for a large constant $c'$. □



8 A PRG for CNF$^\oplus$

We construct a PRG for the class of CNF$^\oplus$. The generator will be the same as in Theorem 7.1. The analysis will also be similar and in fact follows easily from Theorem 7.1. To do this, we shall use the following simple claim.

Lemma 8.1. Let $f : \{\pm 1\}^n \to \{0,1\}$ be a conjunction of parity constraints on $n$ variables. Then, $f$ has $L_1$-norm at most 1.

Proof. Let $S_1, S_2, \ldots, S_m$ be the subsets defining the parity constraints in $f$. Then,



The lemma now follows. 



□ 
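
One way to see the bound of Lemma 8.1 is the following sketch. It assumes that the $i$-th parity constraint requires $\chi_{S_i}(x) = \prod_{j \in S_i} x_j$ to equal a fixed sign $b_i \in \{\pm 1\}$ (the signs $b_i$ are notation introduced here), and it uses the standard fact that the $L_1$-norm of the Fourier coefficients is sub-multiplicative under products.

```latex
\[
f(x) \;=\; \prod_{i=1}^{m} \frac{1 + b_i\,\chi_{S_i}(x)}{2},
\qquad
L_1(f) \;\le\; \prod_{i=1}^{m} L_1\!\left(\frac{1 + b_i\,\chi_{S_i}}{2}\right)
\;\le\; \prod_{i=1}^{m} \left(\tfrac{1}{2} + \tfrac{1}{2}\right) \;=\; 1 .
\]
```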



Theorem 8.2. For every $\epsilon > 0$, there exists an explicit PRG $G : \{0,1\}^r \to \{\pm 1\}^n$ that fools all CNF$^\oplus$ formulas on $n$ variables with error at most $\epsilon$ and seed-length $r = O((\log(n/\epsilon)) \cdot (\log\log(n/\epsilon))^3)$.

Proof. Let $G$ be the generator from Theorem 7.1. We will show that $G$ fools CNF$^\oplus$ as well. This does not follow in a black-box manner from Theorem 7.1, but we will show that analogues of Theorem 2.7, Lemma 7.2 and Lemma 7.3 hold, so that the rest of the proof of Theorem 7.1 can be used as is.

Let $f : \{\pm 1\}^n \to \{0,1\}$ be a CNF$^\oplus$ of size $m$. Let $f = g \wedge h$, where $g$ has all the parity constraints of $f$ and $h$ the clauses.

First observe that by Lemma 8.1 and Theorem 2.7, a similar statement holds for $f$. Let $P_\ell, P_u$ be the $\epsilon$-sandwiching approximators for $h$ as guaranteed by Theorem 2.7. Then, $P'_\ell := g \cdot P_\ell$ and $P'_u := g \cdot P_u$ are $\epsilon$-sandwiching approximators for $f$, and the $L_1$-norm of $P'_\ell$ ($P'_u$) is bounded by the $L_1$-norm of $P_\ell$ ($P_u$) by Lemma 8.1.

Note that for any subset $I \subseteq [n]$, $g_I : \{\pm 1\}^I \to \{0,1\}$ is a constant function. Therefore, $f_I(x) = c_I \cdot h_I(x)$, where $c_I \le 1$. Thus, by applying Lemma 7.2 to $h$, we get an analogous statement for $f$.

Finally, we show an analogue of Lemma 7.3. Suppose that $f$ has acceptance probability at least $\epsilon$. Then, $g$ has at most $\log_2(1/\epsilon)$ clauses. Therefore, by Lemma 7.3 applied to $h$, we also get a similar statement for $f$ with a slightly worse constant of $c'_2 = c_2 + 1$. By arguing as in the proof of Corollary 7.4, we get a similar statement for $f$.

Examining the proof of Theorem 7.1 shows that given the above analogues of Theorem 2.7, Lemma 7.2 and Lemma 7.3, the rest of the proof goes through. The theorem follows. □
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